Abstract. We extend the theory of fast Fourier transforms on finite groups
to finite inverse semigroups. We use a general method for constructing the
irreducible representations of a finite inverse semigroup to reduce the problem
of computing its Fourier transform to the problems of computing Fourier
transforms on its maximal subgroups and a fast zeta transform on its poset
structure. We then exhibit explicit fast algorithms for particular inverse
semigroups of interest: specifically, for the rook monoid and its wreath
products by arbitrary finite groups.



1. Introduction 

Given a complex-valued function $f$ on a finite group $G$, we may view $f$ as an
element of the group algebra $\mathbb{C}G$ by identifying the natural basis of
$\mathbb{C}G$ with the characteristic functions of the elements $g \in G$. That
is, $f$ corresponds to

$$\sum_{g \in G} f(g)\, g \in \mathbb{C}G.$$

Because $\mathbb{C}G$ is a semisimple algebra, it is the direct sum of its
minimal (two-sided) ideals $M_i$:

$$\mathbb{C}G = M_1 \oplus \cdots \oplus M_n.$$

By taking a basis for each of the $M_i$ subject to a "normalization" condition
explained in Section 3, we obtain a basis for $\mathbb{C}G$ known as a Fourier
basis. The Fourier transform of a function $f$ is then its re-expression in
terms of a Fourier basis.

As an example, let $G = \mathbb{Z}_n$, the cyclic group of order $n$. An element
$f$ of the group algebra $\mathbb{C}\mathbb{Z}_n$ expressed with respect to the
natural basis may be viewed as a signal, sampled at $n$ evenly spaced points in
time. In this case, the minimal ideals of $\mathbb{C}\mathbb{Z}_n$ are all
1-dimensional, so a Fourier basis is unique up to scaling factors, and the
Fourier basis here is indeed the usual basis of exponential functions given by
the classical discrete Fourier transform. The re-expression of $f$ in terms of a
Fourier basis thus corresponds to a re-expression of $f$ in terms of the
frequencies that comprise $f$. This change of basis may be computed efficiently
with the help of the classical fast discrete Fourier transform (FFT).

A naive computation of the Fourier transform of $f \in \mathbb{C}G$ requires
$|G|^2$ operations. An operation is defined to be a complex multiplication
followed by a complex addition. The problem of efficiently computing the Fourier
transform of an arbitrary $\mathbb{C}$-valued function on $G$ has been
considered for a wide range of groups $G$, and efficient algorithms for
computing this change of basis now exist for many finite groups. For a survey of
these results see, e.g., [6], [TH], [TH], [15], [16], [20]. For example, it is
known that the Fourier transform of $f \in \mathbb{C}G$ requires no more than:

• $O(n \log n)$ operations if $G = \mathbb{Z}_n$ [HE],

• $O(|S_n| \log^2 |S_n|)$ operations if $G = S_n$, the symmetric group on $n$
elements [ig, and

• $O(|B_n| \log^4 |B_n|)$ operations if $G = B_n$, the hyperoctahedral group
(that is, the signed symmetric group) on $n$ elements [19].

We shall define a fast Fourier transform (FFT) for (or on) a finite group $G$ to
be a procedure for calculating the Fourier transform of an arbitrary
complex-valued function on $G$ which compares favorably to the naive algorithm.
In general, $O(|G| \log^c |G|)$ algorithms are the goal in group FFT theory,
although there exist families of groups $G$ for which there exist greatly
improved, yet not $O(|G| \log^c |G|)$, algorithms, such as the family of matrix
groups over a finite field [13].

In [11] we extended the theory of finite-group FFTs to create FFTs for a
particular inverse semigroup known as the rook monoid. In this paper we extend
the theory of finite-group FFTs to all finite inverse semigroups. In particular,
we provide a method for building FFTs on arbitrary finite inverse semigroups and
we construct $O(|S| \log^c |S|)$ FFTs for specific inverse semigroups $S$ of
interest. Our main results are these.

Theorem (Theorem 4.9). Let $S$ be a finite inverse semigroup with
$\mathcal{D}$-classes $D_0, \ldots, D_n$. Let $r_k$ denote the number of
idempotents in $D_k$. Choose an idempotent $e_k$ from each $\mathcal{D}$-class
$D_k$, and let $G_k$ be the maximal subgroup of $S$ at $e_k$. Then the number of
operations required to compute the Fourier transform of an arbitrary
$\mathbb{C}$-valued function $f$ on $S$ is no more than

$$\mathcal{C}(\zeta_S) + \sum_{k=0}^{n} r_k^2\, \mathcal{C}(G_k),$$

where $\mathcal{C}(\zeta_S)$ is the maximum number of operations needed to
compute the zeta transform of $f$ on $S$ and $\mathcal{C}(G_k)$ is the maximum
number of operations needed to compute the Fourier transform of an arbitrary
$\mathbb{C}$-valued function on $G_k$.

Theorem (Theorem 5.5). If $S = R_n$, the rook monoid on $n$ elements, then the
Fourier transform of an arbitrary $\mathbb{C}$-valued function on $S$ may be
computed in $O(|S| \log^c |S|)$ operations.

Theorem (Theorem 6.12). If $G$ is a finite group and $S = G \wr R_n$, the wreath
product of $R_n$ with $G$, then the Fourier transform of an arbitrary
$\mathbb{C}$-valued function on $S$ may be computed in $O(|S| \log^c |S|)$
operations.

We proceed as follows. In Section 2 we review basic concepts related to
semigroup theory and we list a few properties of the most important inverse
semigroup, the rook monoid. In Section 3 we review some representation theory
related to inverse semigroups, we define the notion of the Fourier transform on
an inverse semigroup, and we make precise the notion of a fast Fourier
transform. In Section 4 we create a general framework for constructing FFTs on
finite inverse semigroups. In particular, we explain how the problem of
computing the Fourier transform on an inverse semigroup may be reduced to the
problems of computing Fourier transforms on its maximal subgroups and a zeta
transform on its poset structure. In Section 5 we proceed according to our
general framework to construct an FFT for the rook monoid. In Section 6 we
generalize this FFT to an FFT for wreath products of the rook monoid with
arbitrary finite groups. Section 7 contains thoughts on future directions for
this line of research.



2. Inverse Semigroups

A semigroup $S$ is a nonempty set with an associative binary operation, which
we will write multiplicatively. If there is an identity element for the
multiplication in $S$, then $S$ is said to be a monoid. A group is a monoid
where every element has a (unique) multiplicative inverse. In this paper we are
concerned with the class of semigroups known as inverse semigroups.

Definition 2.1. An inverse semigroup is a semigroup $S$ such that, for each
$x \in S$, there is a unique $y \in S$ such that

$$xyx = x \quad \text{and} \quad yxy = y.$$

In this case, we write $y = x^{-1}$.

We remark that the condition that $y$ be unique is necessary in this
definition. An element $x \in S$ is said to be regular or von Neumann regular
if there is at least one $y \in S$ satisfying $xyx = x$ and $yxy = y$, and $S$
is said to be regular if every element of $S$ is regular. Consider the full
transformation semigroup $T_n$ on the set $\{1, 2, \ldots, n\}$, that is, all
maps from $\{1, 2, \ldots, n\}$ to itself under composition. It is easy to see
that $T_n$ is regular, and that (for $n \geq 2$) there exist elements
$x \in T_n$ for which there are multiple elements $y \in T_n$ satisfying
$xyx = x$ and $yxy = y$. $T_n$ is therefore not inverse. An equivalent
characterization of inverse semigroups (see, e.g., [10]) is as follows.

Theorem 2.2. An inverse semigroup is a semigroup $S$ which is regular and in
which all idempotents of $S$ commute.

The most important finite inverse semigroup is the rook monoid (also called the
symmetric inverse semigroup) on $n$ elements, which we denote by $R_n$. It is
the semigroup of all injective partial functions from $\{1, \ldots, n\}$ to
itself under the usual operation of partial function composition. In this paper,
we adopt the convention that maps act on the left of sets, and so, for
$g, f \in R_n$, $g \circ f$ is defined for precisely the elements $x$ for which
$x \in \mathrm{dom}(f)$ and $f(x) \in \mathrm{dom}(g)$. $R_n$ is called the rook
monoid because it is isomorphic to the semigroup of all $n \times n$ matrices
with the property that at most one entry in each row is 1 and at most one entry
in each column is 1 (the rest being 0) under matrix multiplication, and such
matrices (called rook matrices) correspond to the set of all possible
placements of non-attacking rooks on an $n \times n$ chessboard. For example,
consider the element $\sigma \in R_4$ defined by

$$\sigma(2) = 1, \quad \sigma(4) = 4.$$

Then, viewed as a partial permutation, $\sigma$ is

$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 \\ - & 1 & - & 4 \end{pmatrix},$$

where the dash indicates that the above entry is not mapped to anything. As a
rook matrix, we have

$$\sigma = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

It is easy to see that $R_n$ is indeed an inverse semigroup, as the unique
semigroup inverse of a rook matrix is its transpose.
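The correspondence between partial permutations and rook matrices can be checked on the example above. The following is a minimal sketch, assuming a partial injection is stored as a Python dict from domain points to images; the helper names (`rook_matrix`, `compose`, `inverse`, `transpose`, `matmul`) are illustrative and do not come from the paper.

```python
def rook_matrix(sigma, n):
    """Rook matrix M of a partial injection: M[i][j] = 1 iff sigma(j+1) = i+1."""
    return [[1 if sigma.get(j + 1) == i + 1 else 0 for j in range(n)]
            for i in range(n)]

def compose(g, f):
    """g∘f (f acts first): defined at x iff x in dom(f) and f(x) in dom(g)."""
    return {x: g[f[x]] for x in f if f[x] in g}

def inverse(sigma):
    """Semigroup inverse of a partial injection: reverse every arrow."""
    return {y: x for x, y in sigma.items()}

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# The element sigma in R_4 with sigma(2) = 1 and sigma(4) = 4.
sigma = {2: 1, 4: 4}
M = rook_matrix(sigma, 4)
for row in M:
    print(row)
```

With the left-action convention, composition of partial maps corresponds to matrix multiplication in the same order, and the transpose of `M` is the rook matrix of the map with every arrow reversed.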

The rook monoid is a generalization of the symmetric group (as the symmetric
group $S_n$ is contained in $R_n$ as the set of elements with nonzero
determinant), and it plays the same role for finite inverse semigroups as the
symmetric group does for finite groups in the following variation of Cayley's
theorem [10, p. 36-37].

Theorem 2.3. Let $S$ be a finite inverse semigroup. Then there exists
$n \in \mathbb{N}$ for which $S$ is isomorphic to a subsemigroup of $R_n$.

It is often useful to consider subsets of $R_n$ whose elements have domains of a
certain size.

Definition 2.4. Given an element $\sigma \in R_n$, the rank of $\sigma$, denoted
$\mathrm{rk}(\sigma)$, is defined to be
$\mathrm{rk}(\sigma) = |\mathrm{dom}(\sigma)| = |\mathrm{ran}(\sigma)|$. It is
clear that the rank of $\sigma$ is the same as the rank of the associated rook
matrix.

We have the following theorem on the size of $R_n$.

Theorem 2.5.

$$|R_n| = \sum_{k=0}^{n} \binom{n}{k}^2 k!.$$

Proof. For any particular rank $k$, there are $\binom{n}{k}$ choices for the
domain and $\binom{n}{k}$ choices for the range of an element of $R_n$, and for
any particular choice of domain and range, there are $k!$ ways of mapping the
domain to the range. □

We also have the following recursive formula, which we will prove in Section 5.

Theorem 2.6. For $n \geq 3$,

$$|R_n| = 2n|R_{n-1}| - (n-1)^2 |R_{n-2}|.$$
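Both counting formulas for the rook monoid are easy to confirm numerically for small $n$. The sketch below is illustrative only: it evaluates the closed form of Theorem 2.5, enumerates the injective partial maps by brute force, and checks the recursion of Theorem 2.6. The function names are hypothetical.

```python
from itertools import combinations, permutations
from math import comb, factorial

def rook_monoid_size(n):
    """Closed form: sum over ranks k of C(n,k)^2 * k!."""
    return sum(comb(n, k) ** 2 * factorial(k) for k in range(n + 1))

def enumerate_rook_monoid(n):
    """Brute force: every injective partial map on {1..n}, stored as a dict."""
    pts = range(1, n + 1)
    return [dict(zip(dom, img))
            for k in range(n + 1)
            for dom in combinations(pts, k)     # choice of domain
            for img in permutations(pts, k)]    # ordered choice of images

def satisfies_recursion(n):
    """Check |R_n| = 2n|R_{n-1}| - (n-1)^2 |R_{n-2}|."""
    return rook_monoid_size(n) == (2 * n * rook_monoid_size(n - 1)
                                   - (n - 1) ** 2 * rook_monoid_size(n - 2))

print([rook_monoid_size(n) for n in range(5)])
print(all(satisfies_recursion(n) for n in range(3, 12)))
```

The brute-force enumeration grows quickly and is only meant as a cross-check for small $n$.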

3. Fourier Transforms for Inverse Semigroups

For the rest of this paper, $S$ will denote a finite inverse semigroup.

Definition 3.1. The semigroup algebra of $S$ over $\mathbb{C}$, denoted
$\mathbb{C}S$, is the formal $\mathbb{C}$-span of the symbols
$\{s\}_{s \in S}$. Multiplication in $\mathbb{C}S$, denoted by $*$, is given by
convolution (i.e., the linear extension of the semigroup operation via the
distributive law): Suppose $f, g \in \mathbb{C}S$, with

$$f = \sum_{r \in S} f(r)\, r, \qquad g = \sum_{t \in S} g(t)\, t.$$

Then

$$f * g = \sum_{r \in S} \sum_{t \in S} f(r) g(t)\, rt
        = \sum_{s \in S} \sum_{r, t \in S \,:\, rt = s} f(r) g(t)\, s.$$

Remark: If $S$ is a group, then convolution may be written in the familiar way:

$$f * g = \sum_{s \in S} \sum_{r \in S} f(r)\, g(r^{-1}s)\, s.$$

Let $f : S \to \mathbb{C}$ be any complex-valued function on $S$. We view $f$ as
an element of the semigroup algebra $\mathbb{C}S$ by associating the natural
basis of $\mathbb{C}S$ with the characteristic functions of the elements
$s \in S$. So $f$ corresponds to

$$\sum_{s \in S} f(s)\, s \in \mathbb{C}S.$$

Thus, $\mathbb{C}S$ is the algebra of all $\mathbb{C}$-valued functions on $S$.
The natural basis $\{s\}_{s \in S}$ of $\mathbb{C}S$ is also called the
semigroup basis.

Definition 3.2. A representation $\rho$ (of dimension $d_\rho \in \mathbb{N}$)
of the semigroup algebra $\mathbb{C}S$ is an algebra homomorphism

$$\rho : \mathbb{C}S \to M_{d_\rho}(\mathbb{C}).$$

Equivalently, a representation $\rho$ of $\mathbb{C}S$ is a
$d_\rho$-dimensional $\mathbb{C}$-vector space which is also a left
$\mathbb{C}S$-module.

Definition 3.3. A representation $\rho$ of $\mathbb{C}S$ is said to be null if
$\rho(a)$ is the zero matrix for all $a \in \mathbb{C}S$.

Definition 3.4. A representation $\rho$ of $\mathbb{C}S$ is said to be
irreducible if it is non-null and simple as a left $\mathbb{C}S$-module. That
is, $\rho$ is irreducible if $\rho \neq 0$ and there is no 4-tuple
$(X, \rho_1, \rho_2, g)$, where $X$ is an invertible matrix, $\rho_1$ and
$\rho_2$ are representations of $\mathbb{C}S$, and $g$ is a matrix-valued
function on $\mathbb{C}S$, for which

$$X \rho(a) X^{-1} = \begin{pmatrix} \rho_1(a) & 0 \\ g(a) & \rho_2(a) \end{pmatrix}$$

for all $a \in \mathbb{C}S$.

Definition 3.5. Two representations $\rho_1$ and $\rho_2$ of $\mathbb{C}S$ are
equivalent if there is an invertible matrix $X$ for which

$$X \rho_1(a) X^{-1} = \rho_2(a)$$

for all $a \in \mathbb{C}S$. That is, two representations are equivalent if they
are isomorphic as left $\mathbb{C}S$-modules.

Let $S$ be a finite inverse semigroup. In [18, Theorem 4.4], Munn proves that
the semigroup algebra $\mathbb{C}S$ is semisimple, and we therefore have:

Theorem 3.6. Any representation of $\mathbb{C}S$ is equivalent to a direct sum
of irreducible and null representations of $\mathbb{C}S$.

Furthermore, Wedderburn's theorem applies to $\mathbb{C}S$.

Theorem 3.7 (Wedderburn's theorem). Let $S$ be a finite inverse semigroup. Let
$\mathcal{Y}$ be a complete set of inequivalent, irreducible representations of
$\mathbb{C}S$. Then $\mathcal{Y}$ is finite, and the map

$$\bigoplus_{\rho \in \mathcal{Y}} \rho \;:\; \mathbb{C}S \to \bigoplus_{\rho \in \mathcal{Y}} M_{d_\rho}(\mathbb{C}) \tag{1}$$

is an isomorphism of algebras. Explicitly, let $f \in \mathbb{C}S$ where
$f = \sum_{s \in S} f(s)\, s$. Then

$$f \mapsto \bigoplus_{\rho \in \mathcal{Y}} \sum_{s \in S} f(s) \rho(s)$$

in this isomorphism.

We also have the following formula for the sum of the squares of the dimensions
of the irreducible representations of $S$.

Corollary 3.8. Let $S$ be a finite inverse semigroup, and let $\mathcal{Y}$ be a
complete set of inequivalent, irreducible representations of $\mathbb{C}S$. Then

$$|S| = \sum_{\rho \in \mathcal{Y}} d_\rho^2. \tag{2}$$

Proof. The formula (2) is just the $\mathbb{C}$-dimensionality of the algebras
appearing in (1). □

Let $S$ be a finite inverse semigroup and let $f \in \mathbb{C}S$ with

$$f = \sum_{s \in S} f(s)\, s.$$

Definition 3.9. If $\rho$ is a representation of $\mathbb{C}S$, then we define
the Fourier transform of $f$ at $\rho$, denoted by $\hat{f}(\rho)$, by

$$\hat{f}(\rho) = \rho(f) = \sum_{s \in S} f(s) \rho(s).$$

Let $\mathcal{Y}$ be a complete set of inequivalent, irreducible representations
of $\mathbb{C}S$. The map in Wedderburn's theorem obtained by "gluing" together
the elements of $\mathcal{Y}$ is called a Fourier transform on (or for) $S$.
Specifically, we have

Definition 3.10. The element of
$\bigoplus_{\rho \in \mathcal{Y}} M_{d_\rho}(\mathbb{C})$ given by

$$\bigoplus_{\rho \in \mathcal{Y}} \sum_{s \in S} f(s) \rho(s)$$

is called the Fourier transform of $f$ with respect to (or relative to)
$\mathcal{Y}$.

Consider the inverse image of the natural basis of the algebra on the right in
(1), that is, the inverse image of the set of matrices in the algebra on the
right having exactly one entry equal to 1 (the rest being 0). This is the target
basis for the Fourier transform on $\mathbb{C}S$. Each block of the algebra on
the right is a minimal two-sided ideal. The inverse image of the basis for a
single block is therefore a basis for a minimal two-sided ideal of
$\mathbb{C}S$. Since the map in (1) is an isomorphism, the inverse images of
distinct columns have intersection $\{0\}$, and this basis for $\mathbb{C}S$
therefore realizes the decomposition of $\mathbb{C}S$ into the direct sum of its
minimal ideals.

Definition 3.11. Let $S$ be a finite inverse semigroup and let $\mathcal{Y}$ be
a complete set of inequivalent, irreducible representations of $\mathbb{C}S$.
The inverse image of the natural basis of the algebra on the right in (1) is
called a Fourier basis for $\mathbb{C}S$. More specifically, it is the Fourier
basis for $\mathbb{C}S$ according to $\mathcal{Y}$. It is also known as the dual
matrix coefficient basis for $\mathbb{C}S$ relative to $\mathcal{Y}$ [12].

When we refer to a Fourier basis for $\mathbb{C}S$, we mean any basis of
$\mathbb{C}S$ that can arise in this manner by choosing an appropriate set of
representations $\mathcal{Y}$. To see why we consider this a "normalization"
condition, consider $S = \mathbb{Z}_n$. Every irreducible representation of
$\mathbb{C}\mathbb{Z}_n$ has dimension 1, so the notion of equivalence of
representations for $\mathbb{C}\mathbb{Z}_n$ reduces to the notion of equality.
Specifically, the irreducible representations of $\mathbb{C}\mathbb{Z}_n$ are
the $\chi_k$ ($k = 0, 1, \ldots, n-1$), given on the natural basis
$\mathbb{Z}_n$ by

$$\chi_k(j) = e^{-2\pi i jk/n}.$$

In this case, the isomorphism (1) is the usual discrete Fourier transform:

$$\hat{f}(\chi_k) = \sum_{j=0}^{n-1} f(j)\, e^{-2\pi i jk/n}, \qquad k = 0, 1, \ldots, n-1,$$

and the associated Fourier basis of $\mathbb{C}\mathbb{Z}_n$ is the usual basis
of exponential functions $\{b_k\}_{k=0}^{n-1}$, normalized so that
$\hat{b}_k(\chi_k) = 1$:

$$b_k = \frac{1}{n} \sum_{j=0}^{n-1} e^{2\pi i jk/n}\, j.$$
Here is the general convolution theorem.

Theorem 3.12. The Fourier transform on $S$ turns convolution into multiplication
of block-diagonal matrices. The Fourier transform turns convolution into
pointwise multiplication if and only if every irreducible representation of
$\mathbb{C}S$ has dimension one.

Proof. Since the map given in Wedderburn's theorem is a homomorphism, it turns
multiplication in $\mathbb{C}S$ (that is, convolution) into multiplication in
$\bigoplus_{\rho \in \mathcal{Y}} M_{d_\rho}(\mathbb{C})$. □
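For $S = \mathbb{Z}_n$ every irreducible representation is one-dimensional, so the theorem reduces to the classical statement that the discrete Fourier transform turns cyclic convolution into pointwise multiplication. The sketch below illustrates this with a naive $n^2$-operation transform (not a fast one); the sign convention $e^{-2\pi i jk/n}$ for the characters and all function names are assumptions made for the illustration.

```python
import cmath

def dft(f):
    """Naive DFT on Z_n: f_hat[k] = sum_t f[t] * exp(-2*pi*i*t*k/n)."""
    n = len(f)
    return [sum(f[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
            for k in range(n)]

def cyclic_convolution(f, g):
    """(f*g)(s) = sum over r of f(r) g(s - r), indices taken mod n."""
    n = len(f)
    return [sum(f[r] * g[(s - r) % n] for r in range(n)) for s in range(n)]

def fourier_basis_element(k, n):
    """Coefficients of b_k = (1/n) sum_t exp(2*pi*i*t*k/n) t in the natural basis."""
    return [cmath.exp(2j * cmath.pi * t * k / n) / n for t in range(n)]

def close(u, v, tol=1e-9):
    return len(u) == len(v) and all(abs(a - b) < tol for a, b in zip(u, v))

f = [1, 2, 0, -1, 3]
g = [2, 0, 1, 1, -2]
lhs = dft(cyclic_convolution(f, g))
rhs = [a * b for a, b in zip(dft(f), dft(g))]
print(close(lhs, rhs))
```

Transforming `fourier_basis_element(k, n)` gives a vector that is 1 in position `k` and 0 elsewhere (up to rounding), which is the normalization condition in this special case.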

We begin our study of the computational complexity of the Fourier transform on
$S$ by introducing some notation.

Definition 3.13. Let $\mathcal{Y}$ be a complete set of inequivalent,
irreducible representations of $\mathbb{C}S$. Suppose that all representations
in $\mathcal{Y}$ are precomputed (that is, evaluated at every element of some
basis of $\mathbb{C}S$, usually the standard $\{s\}_{s \in S}$ basis) and stored
in memory. Then the maximum number of operations (where an operation is defined
to be a complex multiplication together with a complex addition) needed to
compute the Fourier transform of an arbitrary function
$f = \sum_{s \in S} f(s)\, s \in \mathbb{C}S$ with respect to $\mathcal{Y}$ is
denoted by

$$\mathcal{T}_{\mathcal{Y}}(S).$$

Now, let $\mathcal{Y}$ vary over all complete sets of inequivalent, irreducible
representations of $\mathbb{C}S$. We define

$$\mathcal{C}(S) = \min_{\mathcal{Y}} \{\mathcal{T}_{\mathcal{Y}}(S)\}.$$

Given a particular inverse semigroup $S$, the goal is to bound
$\mathcal{C}(S)$. This is often accomplished by constructing a computationally
advantageous set of representations $\mathcal{Y}$ of $\mathbb{C}S$ and proving a
bound on $\mathcal{T}_{\mathcal{Y}}(S)$ that compares favorably to the number of
operations needed to compute the Fourier transform by naive methods, as given in
the following theorem.

Theorem 3.14. For any inverse semigroup $S$, $\mathcal{C}(S) \leq |S|^2$.

Proof. Let $\mathcal{Y}$ be any complete set of inequivalent, irreducible
representations of $\mathbb{C}S$. If $f = \sum_{s \in S} f(s)\, s$, then a naive
computation of (1) requires at most

$$\sum_{\rho \in \mathcal{Y}} |S|\, d_\rho^2 = |S| \sum_{\rho \in \mathcal{Y}} d_\rho^2 = |S|^2$$

operations, the last equality arising from Corollary 3.8. □

A fast Fourier transform (FFT) on (or for) an inverse semigroup $S$ is then an
algorithm for computing the Fourier transform of an arbitrary
$\mathbb{C}$-valued function on $S$ whose computational complexity compares
favorably to that of the naive algorithm.

4. FFTs for Inverse Semigroups: A General Approach

In this section, we explain a theorem of B. Steinberg [23, Theorem 4.6] and we
use it to reduce the problem of creating FFTs on finite inverse semigroups to
the problems of creating FFTs on their maximal subgroups and fast zeta
transforms on their poset structures.

4.1. The Groupoid Basis. Let $S$ be a finite inverse semigroup. We begin by
recalling the natural poset structure of $S$ [5], [10], [^.

Definition 4.1. Let $S$ be a finite inverse semigroup. For $s, t \in S$, define

$$s \leq t \iff s = et \text{ for some idempotent } e \in S
           \iff s = tf \text{ for some idempotent } f \in S.$$

For $R_n$, the idempotents are the restrictions of the identity map, and this
ordering is the same as the ordering

$$s \leq t \iff t \text{ extends } s \text{ as a partial function.}$$

Remark: If $S$ is a group, then its poset structure is trivial in the sense that
$s \leq t \iff s = t$.

If $P$ is a finite poset, then the zeta function $\zeta$ of $P$ is given by:

$$\zeta : P \times P \to \{0, 1\}, \qquad
  \zeta(x, y) = \begin{cases} 1 & \text{if } x \leq y, \\ 0 & \text{otherwise.} \end{cases}$$

The zeta function is invertible over any field $F$ (in fact, over any ring with
unity), and its inverse is called the Möbius function $\mu$. The Möbius function
for $R_n$ over $\mathbb{C}$ is well known [22], [23]. It is, for $x \leq y$,

$$\mu(x, y) = (-1)^{\mathrm{rk}(y) - \mathrm{rk}(x)}.$$
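The closed form for the Möbius function of the rook monoid can be checked by brute force on a small case. The sketch below is an illustration under stated assumptions: elements are stored as frozensets of (point, image) pairs, so that the extension order is plain set inclusion and the rank is the set size, and the Möbius function is computed from its defining recursion rather than from the formula. The function names are hypothetical.

```python
from itertools import combinations, permutations

def rook_monoid(n):
    """All injective partial maps on {1..n}, each a frozenset of (x, s(x)) pairs."""
    pts = range(1, n + 1)
    return [frozenset(zip(dom, img))
            for k in range(n + 1)
            for dom in combinations(pts, k)
            for img in permutations(pts, k)]

def mobius(elements):
    """Moebius function of the extension order, from mu(x,x) = 1 and
    mu(x,y) = -sum of mu(x,z) over x <= z < y."""
    by_rank = sorted(elements, key=len)
    mu = {}
    for x in by_rank:
        for y in by_rank:  # increasing rank, so every z < y is handled before y
            if x == y:
                mu[x, y] = 1
            elif x < y:    # proper subset: y strictly extends x
                mu[x, y] = -sum(mu[x, z] for z in by_rank if x <= z < y)
    return mu

mu = mobius(rook_monoid(3))
agrees = all(value == (-1) ** (len(y) - len(x)) for (x, y), value in mu.items())
print(agrees)
```

The agreement reflects the fact that every interval in this order is a Boolean lattice: the elements between $x$ and $y$ are the subsets of the arrows of $y$ that contain the arrows of $x$.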

We have already seen one natural basis for the semigroup algebra $\mathbb{C}S$,
the semigroup basis $\{s\}_{s \in S}$. Multiplication in $\mathbb{C}S$ with
respect to this basis is just the linear extension of the multiplication in $S$.
In [23], B. Steinberg defines another "natural" basis of $\mathbb{C}S$. To
motivate this new basis, recall that every finite inverse semigroup is
isomorphic to a subsemigroup of a rook monoid and can therefore be viewed as a
collection of partial functions. There is another model for composing partial
functions: only allow the composition if the range of the first function "lines
up" with the domain of the second. For example, if

$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & - & 1 & - \end{pmatrix}, \qquad
  \pi = \begin{pmatrix} 1 & 2 \\ 4 & 3 \end{pmatrix},$$

then the idea is that the composition $\pi \circ \sigma$ is

$$\begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & - & 4 & - \end{pmatrix}$$

and the composition $\sigma \circ \pi$ is disallowed. The groupoid basis of
$\mathbb{C}S$ encodes this.

Definition 4.2. Define, for each $s \in S$, the element
$\lfloor s \rfloor \in \mathbb{C}S$ by

$$\lfloor s \rfloor = \sum_{t \in S \,:\, t \leq s} \mu(t, s)\, t.$$

The collection $\{\lfloor s \rfloor\}_{s \in S}$ is called the groupoid basis of
$\mathbb{C}S$.

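Viewing $S$ as a subsemigroup of $R_n$, then, we have

Theorem 4.3. The groupoid basis is a basis for $\mathbb{C}S$. Multiplication in
$\mathbb{C}S$ relative to this basis is given by the linear extension of

$$\lfloor s \rfloor \lfloor t \rfloor =
  \begin{cases} \lfloor st \rfloor & \text{if } \mathrm{dom}(s) = \mathrm{ran}(t), \\
                0 & \text{otherwise.} \end{cases}$$

Furthermore, the change of basis to the $\{s\}_{s \in S}$ basis of $\mathbb{C}S$
is given by Möbius inversion:

$$s = \sum_{t \in S \,:\, t \leq s} \lfloor t \rfloor. \tag{3}$$

Proof. This is [23, Lemma 4.1 and Theorem 4.2], using our convention that maps
act on the left of sets. □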
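The multiplication rule for the groupoid basis and the inversion formula can be verified exhaustively for a small rook monoid. The following is a minimal sketch, assuming elements of $R_n$ are stored as frozensets of (point, image) pairs and algebra elements as dicts from semigroup elements to coefficients; it uses the Möbius function $(-1)^{\mathrm{rk}(s) - \mathrm{rk}(t)}$ of $R_n$ stated earlier. All names are illustrative.

```python
from itertools import combinations, permutations

def rook_monoid(n):
    """All injective partial maps on {1..n}, each a frozenset of (x, s(x)) pairs."""
    pts = range(1, n + 1)
    return [frozenset(zip(d, img)) for k in range(n + 1)
            for d in combinations(pts, k) for img in permutations(pts, k)]

def compose(s, t):
    """s∘t (t acts first): defined at x iff x in dom(t) and t(x) in dom(s)."""
    s_map = dict(s)
    return frozenset((x, s_map[y]) for x, y in t if y in s_map)

def dom(s):
    return frozenset(x for x, _ in s)

def ran(s):
    return frozenset(y for _, y in s)

def groupoid(s, elements):
    """Semigroup-basis coefficients of the groupoid basis element of s."""
    return {t: (-1) ** (len(s) - len(t)) for t in elements if t <= s}

def multiply(a, b):
    """Convolution product of algebra elements stored as {element: coefficient}."""
    out = {}
    for r, x in a.items():
        for t, y in b.items():
            rt = compose(r, t)
            out[rt] = out.get(rt, 0) + x * y
    return {e: c for e, c in out.items() if c != 0}

def multiplication_rule_holds(n):
    R = rook_monoid(n)
    return all(
        multiply(groupoid(s, R), groupoid(t, R))
        == (groupoid(compose(s, t), R) if dom(s) == ran(t) else {})
        for s in R for t in R)

def inversion_holds(n):
    """Summing the groupoid basis elements below s recovers s itself."""
    R = rook_monoid(n)
    for s in R:
        total = {}
        for t in R:
            if t <= s:
                for e, c in groupoid(t, R).items():
                    total[e] = total.get(e, 0) + c
        if {e: c for e, c in total.items() if c != 0} != {s: 1}:
            return False
    return True

print(multiplication_rule_holds(3), inversion_holds(3))
```

The empty dict plays the role of the zero element of the algebra, so a product that vanishes is reported as `{}`.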

The notions of $\mathrm{dom}(s)$ and $\mathrm{ran}(s)$ may also be defined
intrinsically in terms of $S$, i.e., without reference to an embedding of $S$
into $R_n$. Specifically, for any element $s \in S$, $ss^{-1}$ and $s^{-1}s$ are
idempotent, and one may define

$$\mathrm{dom}(s) = s^{-1}s, \qquad \mathrm{ran}(s) = ss^{-1}.$$

If we use this definition for $s \in R_n$, we see that $\mathrm{dom}(s)$ is
actually not the domain of $s$, but is rather the map which is the identity on
the domain of $s$ and undefined elsewhere, and likewise for $\mathrm{ran}(s)$.
This means that we are abusing the distinction between the domain and range of
a map and the corresponding partial identities. Under this definition, we have
that the groupoid basis of $\mathbb{C}S$ multiplies as follows:

$$\lfloor s \rfloor \lfloor t \rfloor =
  \begin{cases} \lfloor st \rfloor & \text{if } s^{-1}s = tt^{-1}, \\
                0 & \text{otherwise.} \end{cases} \tag{4}$$

The viewpoint, then, is that we have two "natural bases" for $\mathbb{C}S$, the
semigroup basis $\{s\}_{s \in S}$ and the groupoid basis
$\{\lfloor s \rfloor\}_{s \in S}$. (However, note that if $S$ is a group, then
$s = \lfloor s \rfloor \in \mathbb{C}S$ for all $s \in S$, so group algebras
only have one natural basis.)

Our goal is to construct an efficient change of basis from the semigroup basis
of $\mathbb{C}S$ to a Fourier basis. We will do so by constructing an efficient
change of basis from the semigroup basis of $\mathbb{C}S$ to the groupoid basis
of $\mathbb{C}S$ (that is, a fast zeta transform on $S$) and an efficient change
of basis from the groupoid basis of $\mathbb{C}S$ to a Fourier basis, so that
the composition of these two changes of basis will be an FFT for $S$. We will
focus on the second of these changes of basis first. In order to do so, we must
first understand how the groupoid basis realizes the decomposition of
$\mathbb{C}S$ into a direct sum of matrix algebras over group algebras.

4.2. Matrix Algebras Over Group Algebras. Given an element $s \in S$, it is
natural to think of $s$ as an "isomorphism" from $\mathrm{dom}(s)$ to
$\mathrm{ran}(s)$. As in [23], we use this to define the notion of isomorphic
idempotents.

Definition 4.4. If $a, b \in S$ are idempotent, then $a$ and $b$ are said to be
isomorphic idempotents if there is an "isomorphism" from $a$ to $b$, that is, if
there is an element $s \in S$ such that $a = s^{-1}s$ and $b = ss^{-1}$.

Let us now define two idempotents in $S$ to be $\mathcal{D}$-related if they are
isomorphic. For the rook monoid $R_n$, the idempotents are the restrictions of
the identity map, and two idempotents are isomorphic if and only if they have
the same rank. We extend $\mathcal{D}$ to an equivalence relation on $S$ by
defining $s \mathrel{\mathcal{D}} t$ if $s^{-1}s$ is isomorphic to $t^{-1}t$
(or, equivalently, if $ss^{-1}$ is isomorphic to $tt^{-1}$). This is Green's
famous $\mathcal{D}$-relation [3], [7], and the equivalence classes of $S$ with
respect to this relation are called the $\mathcal{D}$-classes of $S$. We mention
that an equivalent characterization of $\mathcal{D}$ is that
$s \mathrel{\mathcal{D}} t$ if and only if $s$ and $t$ generate the same
two-sided ideal in $S$. For $R_n$, there are $n+1$ $\mathcal{D}$-classes. They
are $D_0, D_1, \ldots, D_n$, where $D_k$ is the set of elements of $R_n$ of rank
$k$.

Let $e \in S$ be idempotent. Let $G_e$ be the maximal subgroup at $e$, that is,
the largest subset of $S$ which contains $e$ and which is also a group. It is
easy to see that

$$G_e = \{s \in S : s^{-1}s = ss^{-1} = e\},$$

and that $e$ is the identity of $G_e$. If $a, b$ are isomorphic idempotents, it
is straightforward to show that $G_a \cong G_b$. For $R_n$, the maximal subgroup
at an idempotent $e$ of rank $k$ is isomorphic to the permutation group $S_k$.

Now, let us describe the decomposition of the semigroup algebra $\mathbb{C}S$
into a direct sum of matrix algebras over group algebras. This is B. Steinberg's
result [23, Theorem 4.5], and we include the proof because the construction of
the isomorphism involved is important in the construction of the FFTs to come.
Let $D_0, \ldots, D_n$ be the $\mathcal{D}$-classes of $S$. Let $\mathbb{C}D_k$
be the $\mathbb{C}$-span of $\{\lfloor s \rfloor : s \in D_k\}$. It is immediate
from (4) that $\mathbb{C}S = \bigoplus_{k=0}^{n} \mathbb{C}D_k$.

Theorem 4.5 (B. Steinberg). Let $r_k$ indicate the number of idempotents in
$D_k$, and let $e_k$ be any idempotent in $D_k$. Denote the maximal subgroup of
$S$ at $e_k$ by $G_k$. Then, as algebras,
$\mathbb{C}D_k \cong M_{r_k}(\mathbb{C}G_k)$.

Proof. We already know that $G_a$ and $G_b$ are isomorphic for any idempotents
$a, b \in D_k$. Now, fix an idempotent $e_k \in D_k$, and for every idempotent
$a \in D_k$, fix an element $p_a \in S$ such that $p_a^{-1} p_a = e_k$ and
$p_a p_a^{-1} = a$ (that is, $p_a$ is an isomorphism from $e_k$ to $a$). Let us
take $p_{e_k} = e_k$. It is easy to show that, in fact, $p_a \in D_k$. We view
our $r_k \times r_k$ matrices as being indexed by pairs of idempotents in $D_k$.
We now define our isomorphism $\phi$ by defining it on the basis
$\{\lfloor s \rfloor : s \in D_k\}$ of $\mathbb{C}D_k$ and extending linearly.
So, for an element $\lfloor s \rfloor \in \mathbb{C}D_k$ with $s^{-1}s = a$ and
$ss^{-1} = b$, define

$$\phi(\lfloor s \rfloor) = p_b^{-1}\, s\, p_a\, E_{b,a},$$

where $E_{b,a}$ is the standard $r_k \times r_k$ matrix with a 1 in the $b, a$
position and 0 elsewhere.

A quick calculation shows that $p_b^{-1} s p_a \in G_k$ by construction. It is
straightforward to show that $\phi$ is an isomorphism, with the inverse induced
by, for $s \in G_k$,

$$s E_{b,a} \mapsto \lfloor p_b\, s\, p_a^{-1} \rfloor.$$

□

The corollary is:

Corollary 4.6. $\mathbb{C}S \cong \bigoplus_{k=0}^{n} M_{r_k}(\mathbb{C}G_k)$.

A dimensionality count thus establishes

$$|S| = \sum_{k=0}^{n} r_k^2 |G_k|. \tag{5}$$

Since we will construct an FFT for the rook monoid $R_n$ in Section 5, for
clarity's sake we now explain what the isomorphism constructed in the proof of
Theorem 4.5 translates into when $S = R_n$. Fix a $\mathcal{D}$-class $D_k$
(that is, the subset of elements of $R_n$ of rank $k$), and let us take
$e_k \in D_k$ to be the partial identity on $\{1, \ldots, k\}$, that is

$$e_k = \begin{pmatrix} 1 & 2 & \cdots & k & k+1 & \cdots & n \\
                        1 & 2 & \cdots & k & - & \cdots & - \end{pmatrix}.$$

We then have

$$G_k = \{s \in R_n : \mathrm{dom}(s) = \mathrm{ran}(s) = \{1, \ldots, k\}\}.$$

Let us identify $G_k$ with the permutation group $S_k$ in the obvious manner.

For an idempotent $a \in D_k$ (that is, a rank-$k$ restriction of the identity
map), let us take $p_a$ to be the unique order-preserving bijection from
$\{1, \ldots, k\}$ to $\mathrm{dom}(a) = \mathrm{ran}(a)$. For an element
$s \in R_n$ of rank $k$, let us define the permutation type of $s$,
$\mathrm{perm}(s)$, to be, informally, the "arrows" from $\mathrm{dom}(s)$ to
$\mathrm{ran}(s)$, expressed as a permutation in $G_k = S_k$. For example, if

$$s = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & - & 1 & 2 \end{pmatrix}, \quad \text{then} \quad
  \mathrm{perm}(s) = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}$$

because $s$ sends the first element of its domain to the third element of its
range, the second element of its domain to the first element of its range, and
the third element of its domain to the second element of its range.
Formally, we define

$$\mathrm{perm}(s) = p_{\mathrm{ran}(s)}^{-1}\, s\, p_{\mathrm{dom}(s)},$$

where $\mathrm{dom}(s)$ and $\mathrm{ran}(s)$ are once again understood to be
the corresponding partial identities in $R_n$, so that $p_{\mathrm{dom}(s)}$ is
the unique order-preserving bijection from $\{1, \ldots, k\}$ to
$\mathrm{dom}(s)$ and $p_{\mathrm{ran}(s)}^{-1}$ is the unique order-preserving
bijection from $\mathrm{ran}(s)$ to $\{1, \ldots, k\}$.
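The permutation type is simple to compute directly from the formal definition. Below is a small sketch, assuming a partial injection is stored as a dict from domain points to images and a permutation of $\{1, \ldots, k\}$ as a tuple of images; the function name `perm` mirrors the text, while the data layout is an assumption.

```python
def perm(s):
    """Permutation type of a partial injection s (dict: point -> image).

    Implements p_ran^{-1} ∘ s ∘ p_dom, where p_dom is the order-preserving
    bijection {1..k} -> dom(s) and p_ran^{-1} the order-preserving bijection
    ran(s) -> {1..k}. The result p satisfies p[i-1] = image of i.
    """
    p_dom = {i + 1: x for i, x in enumerate(sorted(s))}
    p_ran_inv = {y: i + 1 for i, y in enumerate(sorted(s.values()))}
    return tuple(p_ran_inv[s[p_dom[i]]] for i in range(1, len(s) + 1))

# The example from the text: 1 -> 4, 3 -> 1, 4 -> 2 in R_4.
s = {1: 4, 3: 1, 4: 2}
print(perm(s))
```

For this example the result is the permutation sending 1, 2, 3 to 3, 1, 2, matching the description in terms of "first, second, third element" of the domain and range.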

The isomorphism $\phi$ defined in the proof of Theorem 4.5 now works as follows.
We have $\binom{n}{k} \times \binom{n}{k}$ matrices, so let us index their rows
and columns by the $k$-subsets of $\{1, \ldots, n\}$. We have

$$\mathbb{C}D_k \cong M_{\binom{n}{k}}(\mathbb{C}S_k)
  \quad \text{by} \quad
  \phi(\lfloor s \rfloor) = \mathrm{perm}(s)\, E_{\mathrm{ran}(s),\, \mathrm{dom}(s)}.$$

Therefore, we have:

Corollary 4.7. $\mathbb{C}R_n \cong \bigoplus_{k=0}^{n} M_{\binom{n}{k}}(\mathbb{C}S_k)$.

Remark: This result was implicit in the work of Munn [17] and was first written
down explicitly by Solomon [21]. Solomon's isomorphism is essentially the same
as the one given here.

4.3. FFTs for Inverse Semigroups. We can now give a bound on the number of
operations needed to change from the groupoid basis of $\mathbb{C}S$ to a
Fourier basis of $\mathbb{C}S$.

Theorem 4.8. Let $S$ be a finite inverse semigroup with $\mathcal{D}$-classes
$D_0, \ldots, D_n$. For each $\mathcal{D}$-class $D_k$, choose an idempotent
$e_k$, and let $G_k$ be the maximal subgroup of $S$ at $e_k$. Let $r_k$ denote
the number of idempotents in $D_k$. Let $v \in \mathbb{C}S$ be given with
respect to the groupoid basis, that is

$$v = \sum_{s \in S} v(s) \lfloor s \rfloor.$$

Then the number of operations needed to compute the Fourier transform of $v$ is
no more than

$$\sum_{k=0}^{n} r_k^2\, \mathcal{C}(G_k).$$

Proof. For each idempotent $a \in D_k$, fix an element $p_a \in S$ such that
$p_a^{-1} p_a = e_k$ and $p_a p_a^{-1} = a$ (and take $p_{e_k} = e_k$). By the
proof of Theorem 4.5 this defines the isomorphism

$$\mathbb{C}S \cong \bigoplus_{k=0}^{n} M_{r_k}(\mathbb{C}G_k).$$

Let $\mathrm{IRR}(G_k)$ be any complete set of inequivalent, irreducible matrix
representations of $\mathbb{C}G_k$. It is clear from this isomorphism that the
irreducible representations of $\mathbb{C}S$ are in one-to-one correspondence
with $\biguplus_{k=0}^{n} \mathrm{IRR}(G_k)$. Specifically, given a
representation $\rho$ of $\mathbb{C}G_k$, we tensor it up to a representation of
$M_{r_k}(\mathbb{C}G_k)$ and extend to $\mathbb{C}S$ by letting it be zero on
the other summands and following the isomorphism. The resulting representation
$\bar{\rho}$ of $\mathbb{C}S$ is thus given by the linear extension of

$$\bar{\rho}(\lfloor s \rfloor) =
  \begin{cases}
    \bar{\rho}\big(p_{ss^{-1}}^{-1}\, s\, p_{s^{-1}s}\, E_{ss^{-1},\, s^{-1}s}\big)
      = E_{ss^{-1},\, s^{-1}s} \otimes \rho\big(p_{ss^{-1}}^{-1}\, s\, p_{s^{-1}s}\big)
      & \text{if } s \in D_k, \\
    0 & \text{otherwise.}
  \end{cases}$$

Furthermore, the collection
$\mathcal{Y} = \{\bar{\rho} : \rho \in \biguplus_{k=0}^{n} \mathrm{IRR}(G_k)\}$
forms a complete set of inequivalent, irreducible matrix representations of
$\mathbb{C}S$. We will use this set of representations to obtain our bound. Let
$\bar{\rho} \in \mathcal{Y}$, and suppose $\bar{\rho}$ was obtained by tensoring
up a representation $\rho$ in $\mathrm{IRR}(G_k)$. Then

$$\hat{v}(\bar{\rho}) = \bar{\rho}(v)
  = \sum_{s \in S} v(s)\, \bar{\rho}(\lfloor s \rfloor)
  = \sum_{s \in D_k} v(s)\, \bar{\rho}(\lfloor s \rfloor),$$

the last equality arising from the fact that $\bar{\rho}$ is identically zero
off of $\mathbb{C}D_k$. Now, let us view $\bar{\rho}(v)$ as an
$r_k \times r_k$ matrix with entries in $d_\rho \times d_\rho$ matrices (so we
are viewing the rows and columns of $\bar{\rho}(v)$ as indexed by the
idempotents in $D_k$). For idempotents $a, b \in D_k$, the $b, a$ entry of
$\bar{\rho}(v)$ is then the $d_\rho \times d_\rho$ matrix

$$\bar{\rho}(v)_{b,a} = \sum_{s \in D_k \,:\, s^{-1}s = a,\ ss^{-1} = b} v(s)\, \rho(p_b^{-1}\, s\, p_a).$$

By the proof of Theorem 4.5 this is the same as

$$\sum_{s \in G_k} v(p_b\, s\, p_a^{-1})\, \rho(s). \tag{6}$$

If we define a function $h_{b,a} : G_k \to \mathbb{C}$ by

$$h_{b,a}(s) = v(p_b\, s\, p_a^{-1}),$$

we see that (6) is just the Fourier transform of the function $h_{b,a}$ on the
group $G_k$ at $\rho$. Notice that this holds regardless of the choice of
$\mathrm{IRR}(G_k)$. Furthermore, once $\mathrm{IRR}(G_k)$ is chosen, it is
clear from the above argument that the collection
$\{\hat{h}_{b,a}(\rho) : \rho \in \mathrm{IRR}(G_k)\}$ consists exactly of the
blocks that compose $\{\hat{v}(\bar{\rho}) : \rho \in \mathrm{IRR}(G_k)\}$. An
algorithm for computing the Fourier transform of $v$ thus presents itself: for
each $\mathcal{D}$-class $D_k$, run $r_k^2$ Fourier transforms on $G_k$ (one for
each pair of idempotents $a, b \in D_k$), and then arrange the results into
block form to construct the $\hat{v}(\bar{\rho})$. The latter step can be done
for free in our computational model because it requires no operations. Thus, the
number of operations required to compute the Fourier transform of $v$ with
respect to $\mathcal{Y}$ is no more than

$$\sum_{k=0}^{n} r_k^2\, \mathcal{T}_{\mathrm{IRR}(G_k)}(G_k).$$

Since we can choose the $\mathrm{IRR}(G_k)$ at will, choosing them to be in
their most computationally advantageous forms for computing Fourier transforms
on the $G_k$ reduces our bound on the number of operations needed to compute the
Fourier transform of $v$ by this approach to

$$\sum_{k=0}^{n} r_k^2\, \mathcal{C}(G_k).$$

□
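The two-stage procedure (a zeta transform to reach the groupoid basis, then group Fourier transforms arranged into blocks) can be illustrated end to end on the rook monoid $R_2$. The sketch below is a naive illustration, not a fast algorithm, and makes simplifying assumptions: elements are frozensets of (point, image) pairs, the maximal subgroups $S_0, S_1, S_2$ have only one-dimensional irreducible representations (trivial and sign), and all names are hypothetical. It checks that the resulting transform turns convolution into block-matrix multiplication.

```python
from itertools import combinations, permutations

N = 2
PTS = range(1, N + 1)
ELEMENTS = [frozenset(zip(d, img)) for k in range(N + 1)
            for d in combinations(PTS, k) for img in permutations(PTS, k)]

def compose(s, t):
    """s∘t (t acts first)."""
    s_map = dict(s)
    return frozenset((x, s_map[y]) for x, y in t if y in s_map)

def convolve(f, g):
    """Product in the semigroup algebra, for functions stored as dicts."""
    h = dict.fromkeys(ELEMENTS, 0)
    for r in ELEMENTS:
        for t in ELEMENTS:
            h[compose(r, t)] += f[r] * g[t]
    return h

def zeta_transform(f):
    """Groupoid-basis coefficients: v(t) = sum of f(s) over all s extending t."""
    return {t: sum(f[s] for s in ELEMENTS if t <= s) for t in ELEMENTS}

def perm(s):
    """Permutation type of s as a tuple of images of 1..k."""
    ran = sorted(y for _, y in s)
    return tuple(ran.index(y) + 1 for _, y in sorted(s))

def sign(p):
    inversions = sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))
    return -1 if inversions % 2 else 1

# One-dimensional irreducible representations of S_0, S_1 and S_2.
IRREPS = {0: [lambda p: 1], 1: [lambda p: 1], 2: [lambda p: 1, sign]}

def fourier_transform(f):
    """List of blocks, one per (rank k, irreducible representation of S_k)."""
    v = zeta_transform(f)
    blocks = []
    for k in range(N + 1):
        idem = [frozenset(c) for c in combinations(PTS, k)]  # index set of size r_k
        for rho in IRREPS[k]:
            block = [[0] * len(idem) for _ in idem]
            for s in (e for e in ELEMENTS if len(e) == k):
                b = idem.index(frozenset(y for _, y in s))   # row: ran(s)
                a = idem.index(frozenset(x for x, _ in s))   # column: dom(s)
                block[b][a] += v[s] * rho(perm(s))
            blocks.append(block)
    return blocks

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

f = {s: i + 1 for i, s in enumerate(ELEMENTS)}
g = {s: (3 * i + 2) % 5 - 2 for i, s in enumerate(ELEMENTS)}
lhs = fourier_transform(convolve(f, g))
rhs = [matmul(A, B) for A, B in zip(fourier_transform(f), fourier_transform(g))]
print(lhs == rhs)
```

The block sizes are 1, 2, 1, 1, whose squares sum to $7 = |R_2|$, in line with the dimension count (5). For larger $n$ the maximal subgroups have higher-dimensional representations, so each entry of a block would itself be a matrix.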

Let us denote by $\mathcal{C}(\zeta_S)$ the maximum number of operations needed
to perform a zeta transform on $\mathbb{C}S$ (that is, the maximum number of
operations needed to re-express an arbitrary element of $\mathbb{C}S$ with
respect to the groupoid basis of $\mathbb{C}S$, given its expression in terms of
the semigroup basis). Our main result follows immediately.

Theorem 4.9. Let $S$ be a finite inverse semigroup with $\mathcal{D}$-classes
$D_0, \ldots, D_n$. For each $\mathcal{D}$-class $D_k$, choose an idempotent
$e_k$, and let $G_k$ be the maximal subgroup of $S$ at $e_k$. Let $r_k$ denote
the number of idempotents in $D_k$. Then

$$\mathcal{C}(S) \leq \mathcal{C}(\zeta_S) + \sum_{k=0}^{n} r_k^2\, \mathcal{C}(G_k).$$

The following corollary concerns the case when "good" FFTs (that is,
$c|G| \log^d |G|$-complexity FFTs) are known for every maximal subgroup $G$ of
$S$.

Corollary 4.10. Suppose $S$ is a finite inverse semigroup with
$\mathcal{D}$-classes $D_0, \ldots, D_n$. Let $r_k$ be the number of idempotents
in $D_k$. Choose an idempotent $e_k$ from each $\mathcal{D}$-class $D_k$, and
let $G_k$ be the maximal subgroup at $e_k$. If
$\mathcal{C}(G_k) \leq c_k |G_k| \log^{d_k} |G_k|$ for all $k$, then for some
constants $c, d$ we have

$$\mathcal{C}(S) \leq \mathcal{C}(\zeta_S) + c|S| \log^d |S|.$$

Proof. Let $c = \max_{0 \leq k \leq n} c_k$, and let
$d = \max_{0 \leq k \leq n} d_k$. Then

$$\begin{aligned}
\mathcal{C}(S) &\leq \mathcal{C}(\zeta_S) + \sum_{k=0}^{n} r_k^2\, \mathcal{C}(G_k) \\
  &\leq \mathcal{C}(\zeta_S) + \sum_{k=0}^{n} r_k^2\, c_k |G_k| \log^{d_k} |G_k| \\
  &\leq \mathcal{C}(\zeta_S) + c \log^d |S| \sum_{k=0}^{n} r_k^2 |G_k| \\
  &= \mathcal{C}(\zeta_S) + c|S| \log^d |S|,
\end{aligned}$$

the last equality arising from (5). □

Thus, the problem of constructing a fast Fourier transform for a finite inverse
semigroup $S$ may be solved by constructing a fast zeta transform on the poset
structure of $S$ and constructing fast Fourier transforms for all of the maximal
subgroups of $S$. In particular, an $O(|S| \log^d |S|)$ zeta transform for $S$ together with
$O(|G| \log^d |G|)$ FFTs for each of the maximal subgroups $G$ of $S$ combine to form
an $O(|S| \log^d |S|)$ FFT for $S$.

Remark: Note that the approach outlined in this section does not make use of the
assumption that a complete set of representations of $\mathbb{C}S$ is precomputed and stored
in memory, and in fact full precomputation is unnecessary. To use the approach
presented here, we only need to precompute a complete set of representations for
each of the maximal subgroups $G_k$ of $S$ (the cost of which is handled in the $C(G_k)$
terms in Theorem 4.9). Equivalently, once we have chosen one idempotent from
each $\mathcal{D}$-class $D_k$ of $S$, we only need to precompute the set of representations $\mathcal{Y}$ used
in the proof of Theorem 4.8 on the following subset of the groupoid basis of $\mathbb{C}S$:

\[ \bigcup_{k=0}^{n} \{ \lfloor s \rfloor : s \in G_k \}. \]

This amounts to precomputing $\mathcal{Y}$ on the full groupoid basis, as the structure of
$\rho(\lfloor s \rfloor)$ can then be inferred for any $\rho \in \mathcal{Y}$ and any other groupoid basis element $\lfloor s \rfloor$
as a simple permutation of blocks of one of the matrices already computed.

Remark: The problem of creating a fast zeta transform on the poset structure 
of a finite inverse semigroup appears difficult to tackle in generality because the 
poset structure can be about as bad as one wants — every finite meet semilattice is 
a (commutative and idempotent) finite inverse semigroup under the meet opera- 
tion, so it is at least possible to encounter any finite meet semilattice as the poset 
structure of an inverse semigroup. It remains to be seen whether there are any 
general principles one might employ when creating fast zeta transforms. In the 
next two sections, we develop specific fast zeta transforms for the poset structures 
of the rook monoid and its wreath product by arbitrary finite groups and combine 
them with known results on group FFTs for the symmetric group and its wreath 
products to obtain $O(|S| \log^d |S|)$ FFTs for these inverse semigroups.

5. An FFT for the Rook Monoid 

We now use the approach from Section 4 to create an $O(|R_n| \log^3 |R_n|)$-complexity
Fourier transform for the rook monoid $R_n$. We begin by handling the term in
Theorem 4.9 concerning the change of basis from the groupoid basis of $\mathbb{C}R_n$ to a Fourier
basis.

Theorem 5.1. $C(R_n) \le C(\zeta_{R_n}) + \frac{3}{4} n(n-1)|R_n|$.

Proof. Let $D_0, D_1, \ldots, D_n$ be the $\mathcal{D}$-classes of $R_n$. Recall that $D_k$ consists of the elements of
$R_n$ of rank $k$. Let $e_k \in D_k$ be the partial identity on $\{1, 2, \ldots, k\}$, so the maximal
subgroup $G_k$ of $R_n$ at $e_k$ is isomorphic to $S_k$.
Corollary 4.7 and Theorem 4.9 then imply that

\[ C(R_n) \le C(\zeta_{R_n}) + \sum_{k=0}^{n} \binom{n}{k}^2 C(S_k). \]

If we let $\mathcal{Y}_k$ denote a complete set of irreducible representations for $S_k$ given in
Young's seminormal or orthogonal form (descriptions of which may be found in [2]
or Chapter 3 of 0), then we may use Maslen's FFT for the symmetric group [12]
to obtain $C(S_k) \le T_{\mathcal{Y}_k}(S_k) \le \frac{3}{4} k(k-1)|S_k|$.

From here, we have

\[
\begin{aligned}
\sum_{k=0}^{n} \binom{n}{k}^2 C(S_k)
  &\le \sum_{k=0}^{n} \binom{n}{k}^2 \tfrac{3}{4} k(k-1) |S_k| \\
  &\le \tfrac{3}{4} n(n-1) \sum_{k=0}^{n} \binom{n}{k}^2 k! \\
  &= \tfrac{3}{4} n(n-1) |R_n|,
\end{aligned}
\]

where the last equality follows from Theorem 2.5. The theorem follows. □

We now turn to analyzing $C(\zeta_{R_n})$. Let $f \in \mathbb{C}R_n$ be an arbitrary element, expressed
with respect to the semigroup basis, that is

\[ f = \sum_{s \in R_n} f(s)\, s. \]

We would like to express $f$ with respect to the groupoid basis, that is

\[ f = \sum_{s \in R_n} g(s) \lfloor s \rfloor, \]

where, by (3), the coefficients $g(s)$ are given by

\[ g(s) = \sum_{t \ge s} f(t). \]

Our goal is to compute the coefficients $g(s)$ in an efficient manner, and we give an
algorithm below for doing so. First, however, we present the proof of Theorem 2.6,
as the algorithm we give below is based (at least in part) on the ideas involved in
the proof.

Theorem 5.2 (Theorem 2.6). For $n \ge 3$,

\[ |R_n| = 2n|R_{n-1}| - (n-1)^2 |R_{n-2}|. \]

Proof. Viewing the elements of $R_n$ as rook matrices, $R_n$ consists of those elements
having all 0's in column 1 and row 1 (of which there are $|R_{n-1}|$), together with, for
each $\alpha \in \{1, \ldots, n\}$, those having a 1 in position $(\alpha, 1)$ (of which there are $n|R_{n-1}|$
total), together with, for each $\alpha \in \{2, \ldots, n\}$, those having a 1 in position $(1, \alpha)$ (of
which there are $(n-1)|R_{n-1}|$ total). Counting the number of elements of $R_n$ in
this way overcounts. For each pair $\alpha, \beta$ with $2 \le \alpha, \beta \le n$, every element with ones
in positions $(\alpha, 1)$ and $(1, \beta)$ (of which there are $(n-1)^2 |R_{n-2}|$ total) gets counted
twice. Hence $|R_n| = |R_{n-1}| + n|R_{n-1}| + (n-1)|R_{n-1}| - (n-1)^2|R_{n-2}|$, which is the stated formula. □
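The recurrence can be checked numerically against the rank-by-rank count of rook matrices. The sketch below assumes the closed form $|R_n| = \sum_k \binom{n}{k}^2 k!$ (choose $k$ rows, $k$ columns, and a bijection between them); the function names `rook_size` and `rook_size_rec` are illustrative only and do not come from the paper.

```python
from math import comb, factorial

def rook_size(n):
    """|R_n| counted by rank: choose k rows, k columns, and a bijection."""
    return sum(comb(n, k) ** 2 * factorial(k) for k in range(n + 1))

def rook_size_rec(n):
    """|R_n| via |R_m| = 2m|R_{m-1}| - (m-1)^2 |R_{m-2}|, seeded with |R_0|, |R_1|."""
    prev, cur = 1, 2  # |R_0| = 1, |R_1| = 2
    if n == 0:
        return prev
    for m in range(2, n + 1):
        prev, cur = cur, 2 * m * cur - (m - 1) ** 2 * prev
    return cur

if __name__ == "__main__":
    for n in range(8):
        print(n, rook_size(n), rook_size_rec(n))
```

The theorem is stated for $n \ge 3$; the loop above also applies the same formula at $m = 2$, where it happens to give the correct value $|R_2| = 7$.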



We now explain the fast zeta transform, noting that the savings in time afforded
by this algorithm come at the expense of a modest additional storage requirement
over the naive algorithm: the algorithm presented here requires the storage of up
to $O(n^{1/4}|R_n|)$ complex numbers in memory during runtime (see Theorems 5.6 and
5.7), as opposed to the naive algorithm, which requires at most $2|R_n|$.

Let us denote $g(s) = \sum_{t \ge s} f(t)$ by $\zeta_f(s)$. The basic idea is to "work from the
top down." Since we are trying to compute $\zeta_f(s)$ for all $s \in R_n$, it makes sense to
begin with the elements of rank $n$. If $\mathrm{rk}(s) = n$, then there is no element $t$ such
that $t > s$, so $\zeta_f(s) = f(s)$, and this requires no operations. Next, if $\mathrm{rk}(s) = n-1$,
then there is only one element $t \in R_n$ such that $t > s$, so

\[ \zeta_f(s) = f(s) + f(t) = f(s) + \zeta_f(t). \]

Next, if $\mathrm{rk}(s) = n-2$, consider the poset consisting of the elements $t \in R_n$ for
which $t \ge s$. This poset is isomorphic to the poset for $R_2$, with $s$ in the place of the
0 element. We proceed down in rank in this manner, and the aim of this fast zeta
transform is, in general, to find a way to re-use the $\zeta_f(t)$ we have already computed
in order to efficiently compute $\zeta_f(s)$. In fact, instead of computing just $\zeta_f(s)$ for
the elements $s$ of rank $k$, we compute $\zeta_f(s)$ along with $n-k$ other numbers for each
element $s$ of rank $k$. These other numbers are needed for the efficient computation
of the zeta transform at elements of rank $k-1$ and $k-2$, and can be discarded
when they are no longer needed. We introduce some notation.
Let $s \in R_n$. Then $s$ is a partial permutation of $\{1, 2, \ldots, n\}$.

• Let $d_i(s)$ be the $i$th element of $\{1, 2, \ldots, n\}$ not in $\mathrm{dom}(s)$.

• Let $r_i(s)$ be the $i$th element of $\{1, 2, \ldots, n\}$ not in $\mathrm{ran}(s)$.

That is, $d_i(s)$ is simply the $i$th element of the complement of the domain of $s$ (taken
in order), and similarly for $r_i(s)$. Define "partial" zeta transforms at $s$ as follows:

\[
\zeta_f\bigl(s, \{d_1(s), d_2(s), \ldots, d_m(s)\}, \{r_1(s), r_2(s), \ldots, r_m(s)\}\bigr)
 = \sum_{\substack{t \ge s: \\ d_1(s), \ldots, d_m(s) \notin \mathrm{dom}(t) \\ r_1(s), \ldots, r_m(s) \notin \mathrm{ran}(t)}} f(t).
\]

Our zeta transform proceeds as follows, with steps $0, 1, \ldots, n$:

• Step 0: For all $s \in R_n$ with $\mathrm{rk}(s) = n$, compute all $\zeta_f(s, \{\}, \{\}) = \zeta_f(s)$ (0
operations).

• Step 1: For all $s \in R_n$ with $\mathrm{rk}(s) = n-1$, compute $\zeta_f(s, \{\}, \{\}) = \zeta_f(s)$
and $\zeta_f(s, \{d_1(s)\}, \{r_1(s)\})$ (1 operation for each element $s$).

• Step $n-k$: For all $s \in R_n$ with $\mathrm{rk}(s) = k$, compute all

\[
\begin{aligned}
&\zeta_f(s, \{\}, \{\}) = \zeta_f(s), \\
&\zeta_f(s, \{d_1(s)\}, \{r_1(s)\}), \\
&\zeta_f(s, \{d_1(s), d_2(s)\}, \{r_1(s), r_2(s)\}), \\
&\qquad \vdots \\
&\zeta_f(s, \{d_1(s), d_2(s), \ldots, d_{n-k}(s)\}, \{r_1(s), r_2(s), \ldots, r_{n-k}(s)\}).
\end{aligned}
\]



Theorem 5.3. Step $n-k$ requires at most

\[ \left( (n-k)^2 + \frac{(n-k-1)(n-k)(2n-2k-1)}{6} \right) \binom{n}{k}^2 k! \]

operations in total.

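Proof. We will show that, for an element $s \in R_n$ with $\mathrm{rk}(s) = k$, computing all

\[
\begin{aligned}
&\zeta_f(s, \{\}, \{\}) = \zeta_f(s), \\
&\zeta_f(s, \{d_1(s)\}, \{r_1(s)\}), \\
&\zeta_f(s, \{d_1(s), d_2(s)\}, \{r_1(s), r_2(s)\}), \\
&\qquad \vdots \\
&\zeta_f(s, \{d_1(s), d_2(s), \ldots, d_{n-k}(s)\}, \{r_1(s), r_2(s), \ldots, r_{n-k}(s)\})
\end{aligned}
\]

requires at most

\[ (n-k)^2 + \frac{(n-k-1)(n-k)(2n-2k-1)}{6} \]

additions, assuming that steps $0, 1, \ldots, n-k-1$ have already been completed. Since there are $\binom{n}{k}^2 k!$ elements of rank $k$, the theorem follows from this.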

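Let $s * (d_i(s) \mapsto r_j(s))$ denote the element of $R_n$ that is obtained by adding $d_i(s)$
to the domain of $s$ and sending it to $r_j(s)$. For example, if

\[ s = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 2 & - & 5 & - & - & - & 3 \end{pmatrix}, \]

then

\[ s * (d_2(s) \mapsto r_3(s)) = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 2 & - & 5 & 6 & - & - & 3 \end{pmatrix}. \]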

Now, consider the poset of elements $t \in R_n$ with $t \ge s$. This poset is isomorphic
to the poset for $R_{n-k}$, with $s$ in the place of the 0 element. Following the idea in
the proof of Theorem 5.2, we have:

\[
\begin{aligned}
\zeta_f(s, \{\}, \{\}) ={}& \zeta_f(s * (d_1(s) \mapsto r_1(s)), \{\}, \{\}) \\
&+ \zeta_f(s * (d_2(s) \mapsto r_1(s)), \{\}, \{\}) + \cdots + \zeta_f(s * (d_{n-k}(s) \mapsto r_1(s)), \{\}, \{\}) \\
&+ \zeta_f(s * (d_1(s) \mapsto r_2(s)), \{\}, \{\}) + \cdots + \zeta_f(s * (d_1(s) \mapsto r_{n-k}(s)), \{\}, \{\}) \\
&- \sum_{i,j \in \{2, \ldots, n-k\}} \zeta_f(s * (d_i(s) \mapsto r_1(s)) * (d_1(s) \mapsto r_j(s)), \{\}, \{\}) \\
&+ \zeta_f(s, \{d_1(s)\}, \{r_1(s)\}).
\end{aligned}
\]

Notice that every term in this sum, with the exception of $\zeta_f(s, \{d_1(s)\}, \{r_1(s)\})$,
was computed in an earlier step. After all, $\mathrm{rk}(s * (d_i(s) \mapsto r_j(s))) > \mathrm{rk}(s)$ for all
$i, j$. Thus, once we have $\zeta_f(s, \{d_1(s)\}, \{r_1(s)\})$, all we have to do to compute $\zeta_f(s)$
is add these terms up. To compute $\zeta_f(s, \{d_1(s)\}, \{r_1(s)\})$, we again follow the idea
in the proof of Theorem 5.2 and we write:

\[
\begin{aligned}
\zeta_f(s, \{d_1(s)\}, \{r_1(s)\}) ={}& \zeta_f(s * (d_2(s) \mapsto r_2(s)), \{d_1(s)\}, \{r_1(s)\}) \\
&+ \zeta_f(s * (d_3(s) \mapsto r_2(s)), \{d_1(s)\}, \{r_1(s)\}) + \cdots + \zeta_f(s * (d_{n-k}(s) \mapsto r_2(s)), \{d_1(s)\}, \{r_1(s)\}) \\
&+ \zeta_f(s * (d_2(s) \mapsto r_3(s)), \{d_1(s)\}, \{r_1(s)\}) + \cdots + \zeta_f(s * (d_2(s) \mapsto r_{n-k}(s)), \{d_1(s)\}, \{r_1(s)\}) \\
&- \sum_{i,j \in \{3, \ldots, n-k\}} \zeta_f(s * (d_i(s) \mapsto r_2(s)) * (d_2(s) \mapsto r_j(s)), \{d_1(s)\}, \{r_1(s)\}) \\
&+ \zeta_f(s, \{d_1(s), d_2(s)\}, \{r_1(s), r_2(s)\}).
\end{aligned}
\]

Notice that $i, j \ge 2$ implies

\[
\zeta_f(s * (d_i(s) \mapsto r_j(s)), \{d_1(s)\}, \{r_1(s)\})
 = \zeta_f\bigl(s * (d_i(s) \mapsto r_j(s)), \{d_1(s * (d_i(s) \mapsto r_j(s)))\}, \{r_1(s * (d_i(s) \mapsto r_j(s)))\}\bigr),
\]

so every term in this sum, with the exception of $\zeta_f(s, \{d_1(s), d_2(s)\}, \{r_1(s), r_2(s)\})$,
was computed in an earlier step.

In general, suppose $D = \{d_1(s), d_2(s), \ldots, d_m(s)\}$, $R = \{r_1(s), r_2(s), \ldots, r_m(s)\}$.
If $m = n-k$, then we have

\[ \zeta_f(s, D, R) = f(s). \]

If $m = n-k-1$, then we have

\[
\zeta_f(s, D, R) = \zeta_f(s * (d_{n-k}(s) \mapsto r_{n-k}(s)), D, R)
 + \zeta_f(s, D \cup \{d_{n-k}(s)\}, R \cup \{r_{n-k}(s)\}).
\]

Otherwise, $m < n-k-1$, and we have

\[
\begin{aligned}
\zeta_f(s, D, R) ={}& \zeta_f(s * (d_{m+1}(s) \mapsto r_{m+1}(s)), D, R) \\
&+ \zeta_f(s * (d_{m+2}(s) \mapsto r_{m+1}(s)), D, R) + \cdots + \zeta_f(s * (d_{n-k}(s) \mapsto r_{m+1}(s)), D, R) \\
&+ \zeta_f(s * (d_{m+1}(s) \mapsto r_{m+2}(s)), D, R) + \cdots + \zeta_f(s * (d_{m+1}(s) \mapsto r_{n-k}(s)), D, R) \\
&- \sum_{i,j \in \{m+2, \ldots, n-k\}} \zeta_f(s * (d_i(s) \mapsto r_{m+1}(s)) * (d_{m+1}(s) \mapsto r_j(s)), D, R) \\
&+ \zeta_f(s, D \cup \{d_{m+1}(s)\}, R \cup \{r_{m+1}(s)\}),
\end{aligned}
\qquad (7)
\]

where every term in the sum, with the exception of

\[ \zeta_f(s, D \cup \{d_{m+1}(s)\}, R \cup \{r_{m+1}(s)\}), \]

was computed in an earlier step of the algorithm. Once we have computed
$\zeta_f(s, D \cup \{d_{m+1}(s)\}, R \cup \{r_{m+1}(s)\})$, the number of operations required to compute $\zeta_f(s, D, R)$
is thus no more than

\[ (n-k-m) + (n-k-m-1) + (n-k-m-1)^2. \]

We do this for $m$ from $n-k$ to 0 to compute, in order:

\[
\begin{aligned}
&\zeta_f(s, \{d_1(s), d_2(s), \ldots, d_{n-k}(s)\}, \{r_1(s), r_2(s), \ldots, r_{n-k}(s)\}), \\
&\qquad \vdots \\
&\zeta_f(s, \{d_1(s), d_2(s)\}, \{r_1(s), r_2(s)\}), \\
&\zeta_f(s, \{d_1(s)\}, \{r_1(s)\}), \\
&\zeta_f(s, \{\}, \{\}) = \zeta_f(s),
\end{aligned}
\]

which is what we want. The total number of operations required is thus no more
than

\[
\sum_{m=0}^{n-k} \Bigl[ (n-k-m) + (n-k-m-1) + (n-k-m-1)^2 \Bigr]
 = (n-k)^2 + \frac{(n-k-1)(n-k)(2n-2k-1)}{6}.
\]

□


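This algorithm yields the bound

Theorem 5.4. For $n \ge 3$, $C(\zeta_{R_n}) \le \frac{2}{3} n^3 |R_n|$.

Proof. Using the algorithm given above, we have

\[ C(\zeta_{R_n}) \le \sum_{k=0}^{n} \left( (n-k)^2 + \frac{(n-k-1)(n-k)(2n-2k-1)}{6} \right) \binom{n}{k}^2 k!. \]

When $n \ge 3$,

\[ (n-k)^2 + \frac{(n-k-1)(n-k)(2n-2k-1)}{6} \le n^2 + \frac{n^3}{3} \le \frac{2}{3} n^3, \]

in which case we have

\[ C(\zeta_{R_n}) \le \frac{2}{3} n^3 \sum_{k=0}^{n} \binom{n}{k}^2 k! = \frac{2}{3} n^3 |R_n|. \]

□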



Combining this fast zeta transform with Theorem 5.1, we obtain

Theorem 5.5. $C(R_n) = O(|R_n| \log^3 |R_n|)$.

Proof. For $n \ge 3$, we have

\[ C(R_n) \le \frac{2}{3} n^3 |R_n| + \frac{3}{4} n(n-1) |R_n|. \]

Since $|R_n| \ge n!$ and $n = O(\log(n!))$, we are done. □



Remark: This result considerably improves the bound on $C(R_n)$ we obtained in
Theorem 8.2], where we used a naive implementation of the zeta transform on
$R_n$.


We now turn to an analysis of the memory required to use the fast zeta transform.
Since we compute $n-k+1$ complex numbers for each element of rank $k$, it is
immediate that we need to store no more than $(n+1)|R_n|$ complex numbers during
runtime. However, by (7), we only need partial zeta transforms of elements of rank
$k+1$ and $k+2$ to compute the partial zeta transforms for an element of rank $k$,
so when we begin step $n-k$ of our algorithm (compute all partial zeta transforms
for the elements of rank $k$) we may discard the partial zeta transforms $\zeta_f(s, A, B)$
for $A, B \ne \{\}$ and $\mathrm{rk}(s) > k+2$.

Storing the inputs (the $f(s)$) and the outputs (the $\zeta_f(s)$) of the zeta transform
requires the storage of $2|R_n|$ complex numbers. At the beginning of step $n-k$, let
us allocate memory for all of the partial zeta transforms of the elements of rank $k$
and discard the partial zeta transforms of rank $k+3$ and higher that we no longer
need. Note that for any $s \in R_n$, one of the $\zeta_f(s, A, B)$ that we compute is just
$f(s)$ and another is $\zeta_f(s)$, so at the beginning of step $n-k$ we allocate memory for
$(n-k-1)\binom{n}{k}^2 k!$ complex numbers that will be discarded later (when $k = n$ we
interpret this to be 0).

Theorem 5.6. This fast zeta transform requires the storage of no more than
$O(n^{1/4}|R_n|)$ complex numbers in memory at any point during its execution.

Proof. At the beginning of step $n-k$ we allocate memory for $(n-k-1)\binom{n}{k}^2 k!$
complex numbers that will be discarded later. We continue to store all of the partial
zeta transforms of the elements only of rank $k+1$ and $k+2$ as well, so at no point
will we ever need to store more than

\[ 2|R_n| + 3 \max_{k \in \{0, 1, \ldots, n-1\}} (n-k-1)\binom{n}{k}^2 k! \]

complex numbers in memory during the execution of the fast zeta transform. We
claim

\[ \max_{k \in \{0, 1, \ldots, n-1\}} (n-k-1)\binom{n}{k}^2 k! = O(n^{1/4}|R_n|). \]



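To see this, write $k = n - x\sqrt{n}$ for some $0 < x \le \sqrt{n}$. Then using Stirling's
approximation we have

\[
\begin{aligned}
(n-k-1)\binom{n}{k}^2 k!
 &\le \frac{(n-k)\, n!\, n!}{k!\, (n-k)!\, (n-k)!}
  = \frac{x\sqrt{n}\; n!\, n!}{(n - x\sqrt{n})!\, \bigl((x\sqrt{n})!\bigr)^2} \\
 &\le \frac{n!\, n!}{(n - x\sqrt{n})!\; 2\pi\, (x\sqrt{n}/e)^{2x\sqrt{n}}\, e^{2/(12x\sqrt{n}+1)}} \\
 &= n^{1/4}\sqrt{\frac{e}{\pi}} \cdot \frac{n!\, e^{2\sqrt{n}}}{2 n^{1/4}\sqrt{\pi e}}
    \cdot \frac{n!}{(n - x\sqrt{n})!\, \sqrt{n}^{\,2x\sqrt{n}}}
    \cdot \frac{e^{2x\sqrt{n}}}{x^{2x\sqrt{n}}\, e^{2\sqrt{n}}}
    \cdot \frac{1}{e^{2/(12x\sqrt{n}+1)}}.
\end{aligned}
\qquad (8)
\]

By [ni Theorem 1], we have that $|R_n|$ is asymptotically $n!\, e^{2\sqrt{n}} / (2 n^{1/4}\sqrt{\pi e})$, so
for $\epsilon > 0$ let $n$ be large enough so that $n!\, e^{2\sqrt{n}} / (2 n^{1/4}\sqrt{\pi e}) \le (1+\epsilon)|R_n|$. All of the
terms in the product (8) are nonnegative, and we show below that the final three
terms are each bounded above by 1. From this it follows that for $n$ large enough,
for all $k$ we have

\[ (n-k-1)\binom{n}{k}^2 k! \le n^{1/4}\sqrt{\frac{e}{\pi}}\,(1+\epsilon)|R_n|, \]

so, since $\sqrt{e/\pi} < 1$, for $n$ large enough, for all $k$ we have

\[ (n-k-1)\binom{n}{k}^2 k! \le n^{1/4}|R_n|, \]

and thus for $n$ large enough the number of complex numbers we need to store in
memory during the execution of the fast zeta transform is bounded by $2|R_n| +
3n^{1/4}|R_n|$.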

To finish the proof, then, we show that the final three terms of the product (8)
are bounded above by 1. First,

\[
\begin{aligned}
\log\left( \frac{n!}{(n - x\sqrt{n})!\, \sqrt{n}^{\,2x\sqrt{n}}} \right)
 &= \log(n) + \log(n-1) + \cdots + \log(n - x\sqrt{n} + 1) - x\sqrt{n}\log(n) \\
 &\le x\sqrt{n}\log(n) - x\sqrt{n}\log(n) = 0,
\end{aligned}
\]

so

\[ \frac{n!}{(n - x\sqrt{n})!\, \sqrt{n}^{\,2x\sqrt{n}}} \le 1. \]

Next, for fixed $n$ and $x \in (0, \sqrt{n}]$, elementary calculus shows that

\[ f(x) = \frac{e^{2x\sqrt{n}}}{x^{2x\sqrt{n}}} \]

is increasing for $x \in (0, 1]$ and decreasing for $x \in [1, \sqrt{n}]$. Hence $f(x)$ is maximized
at $x = 1$, where it has value $e^{2\sqrt{n}}$. Thus

\[ \frac{e^{2x\sqrt{n}}}{x^{2x\sqrt{n}}\, e^{2\sqrt{n}}} \le 1. \]

Finally, it is immediate that

\[ \frac{1}{e^{2/(12x\sqrt{n}+1)}} \le 1, \]

which completes the proof. □
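The size of the discardable storage relative to $n^{1/4}|R_n|$ can be inspected numerically. The sketch below assumes the count $(n-k-1)\binom{n}{k}^2 k!$ for the numbers allocated at step $n-k$ and the closed form $|R_n| = \sum_k \binom{n}{k}^2 k!$; the names `peak_discardable` and `ratio` are illustrative, and the printed values are only a sanity check for moderate $n$, not a substitute for the asymptotic argument.

```python
from math import comb, factorial

def rook_size(n):
    """|R_n| counted by rank."""
    return sum(comb(n, k) ** 2 * factorial(k) for k in range(n + 1))

def peak_discardable(n):
    """max over k in {0, ..., n-1} of (n-k-1) * C(n,k)^2 * k!."""
    return max((n - k - 1) * comb(n, k) ** 2 * factorial(k) for k in range(n))

def ratio(n):
    """Peak discardable storage divided by n^(1/4) |R_n|."""
    return peak_discardable(n) / (n ** 0.25 * rook_size(n))

if __name__ == "__main__":
    for n in (4, 16, 36, 64, 100):
        print(n, round(ratio(n), 3))
```

Exact integers are used for the counts, and only the final division is done in floating point, so the ratio stays accurate for the values of $n$ shown.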



Next, we show that the bound given in Theorem 5.6 is, up to a constant factor, the best possible.

Theorem 5.7. There exist infinitely many $n$ for which this fast zeta transform
requires the storage of at least a constant multiple of $n^{1/4}|R_n|$ complex numbers in memory at some
point during its execution.

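Proof. Let $n$ be a square. Proceeding rank by rank, at some point we will have all
of the partial zeta transforms of all of the elements of rank $n - \sqrt{n}$ in memory. At
this point, we will be storing

\[ (\sqrt{n} - 1)\binom{n}{\sqrt{n}}^2 (n - \sqrt{n})! \]

complex numbers which we intend to discard later. Now, using Stirling's approximation
we have

\[
\begin{aligned}
(\sqrt{n} - 1)\binom{n}{\sqrt{n}}^2 (n - \sqrt{n})!
 &= \frac{(\sqrt{n} - 1)\, n!\, n!}{(n - \sqrt{n})!\, \bigl((\sqrt{n})!\bigr)^2} \\
 &\ge \frac{(\sqrt{n} - 1)\, n!\, n!}{(n - \sqrt{n})!\; 2\pi\sqrt{n}\, (\sqrt{n}/e)^{2\sqrt{n}}\, e^{1/(6\sqrt{n})}} \\
 &= n^{1/4}\sqrt{\frac{e}{\pi}} \cdot \frac{n!\, e^{2\sqrt{n}}}{2 n^{1/4}\sqrt{\pi e}}
    \cdot \frac{\sqrt{n} - 1}{\sqrt{n}}
    \cdot \frac{n!}{(n - \sqrt{n})!\, n^{\sqrt{n}}}
    \cdot \frac{1}{e^{1/(6\sqrt{n})}}.
\end{aligned}
\qquad (9)
\]

By in Theorem 1], we have that $|R_n|$ is asymptotically $n!\, e^{2\sqrt{n}} / (2 n^{1/4}\sqrt{\pi e})$, so
for $\epsilon > 0$ let $n$ be large enough so that:

• $n!\, e^{2\sqrt{n}} / (2 n^{1/4}\sqrt{\pi e}) \ge (1 - \epsilon)|R_n|$,

• $(1 - 1/\sqrt{n})^{\sqrt{n}} \ge (1 - \epsilon) \cdot 1/e$,

• $(\sqrt{n} - 1)/\sqrt{n} \ge 1 - \epsilon$, and

• $1/e^{1/(6\sqrt{n})} \ge 1 - \epsilon$.

Note that

\[
\begin{aligned}
\log\left( \frac{n!}{(n - \sqrt{n})!\, n^{\sqrt{n}}} \right)
 &= \log(n) + \log(n-1) + \cdots + \log(n - \sqrt{n} + 1) - \sqrt{n}\log(n) \\
 &\ge \sqrt{n}\log(n - \sqrt{n}) - \sqrt{n}\log(n)
  = \log\left( \Bigl(1 - \frac{1}{\sqrt{n}}\Bigr)^{\sqrt{n}} \right),
\end{aligned}
\]

so

\[ \frac{n!}{(n - \sqrt{n})!\, n^{\sqrt{n}}} \ge \Bigl(1 - \frac{1}{\sqrt{n}}\Bigr)^{\sqrt{n}}. \]

Combining this with the fact that all of the terms in the product (9) are nonnegative,
we obtain

\[ (\sqrt{n} - 1)\binom{n}{\sqrt{n}}^2 (n - \sqrt{n})! \ge \frac{n^{1/4}}{\sqrt{\pi e}}\,(1 - \epsilon)^4 |R_n|, \]

so for all squares $n$ large enough, $(\sqrt{n} - 1)\binom{n}{\sqrt{n}}^2 (n - \sqrt{n})!$ is bounded below by a
constant multiple of $n^{1/4}|R_n|$.

□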

6. FFTs for Rook Wreath Products

Let $G$ be a finite group. The rook wreath product $G \wr R_n$ is the semigroup of all
$n \times n$ matrices with entries in $\{0\} \cup G$ having at most one non-zero entry per row
and column under the operation of matrix multiplication. Let us write 1 for the
identity of $G$. Clearly, then, we recover the rook monoid $R_n$ as $\mathbb{Z}_1 \wr R_n$. It is easy
to see that the idempotents of $G \wr R_n$ are precisely the idempotents of $R_n$ and that
$G \wr R_n$ is an inverse semigroup.

In this section we again use the approach detailed in Section 4 to create an
$O(|G \wr R_n| \log^4 |G \wr R_n|)$-complexity FFT for $G \wr R_n$. We begin by extending notions
about the rook monoid to $G \wr R_n$ and recording a number of facts about $G \wr R_n$.

6.1. Facts about $G \wr R_n$.

Definition 6.1. For an element $s \in G \wr R_n$, the rank of $s$, denoted $\mathrm{rk}(s)$, is the
number of rows of $s$ which contain nonzero entries.

Equivalently, $\mathrm{rk}(s)$ is the number of columns of $s$ which contain nonzero entries.

Definition 6.2. The symmetric group wreath product $G \wr S_n$ is the group of all
$n \times n$ matrices with entries in $\{0\} \cup G$ having exactly one non-zero entry per row
and column. The operation on $G \wr S_n$ is matrix multiplication.

Thus $G \wr S_n$ is contained in $G \wr R_n$ as the rank-$n$ elements.

We generalize the notion of domain and range from $R_n$ to $G \wr R_n$ as follows.

Definition 6.3. Let $s \in G \wr R_n$. Define $\mathrm{dom}(s)$ to be the set of indices of the
columns of $s$ which contain nonzero entries, and $\mathrm{ran}(s)$ to be the set of indices of
the rows of $s$ which contain nonzero entries.

This definition agrees with our previous definitions of inverse semigroup domain
and range, that is,

\[ \mathrm{dom}(s) = s^{-1}s, \qquad \mathrm{ran}(s) = ss^{-1}, \]

provided that we once again abuse the distinction between the domain and range
of a map and the corresponding partial identities (as elements of $R_n$).

We must understand the maximal subgroups of $G \wr R_n$. Let $e \in G \wr R_n$ be
idempotent, with $\mathrm{rk}(e) = k$.

Theorem 6.4. The maximal subgroup of $G \wr R_n$ at $e$ is isomorphic to $G \wr S_k$.

Proof. Denote this subgroup by $G_e$. We have

\[ G_e = \{ s \in G \wr R_n : ss^{-1} = s^{-1}s = e \}. \]

Suppose that $\mathrm{dom}(e) = \{i_1, \ldots, i_k\}$ (and hence $\mathrm{ran}(e) = \{i_1, \ldots, i_k\}$, because $e$ is
idempotent). Clearly, then,

\[ \{ x \in G \wr R_n : \mathrm{dom}(x) = \mathrm{ran}(x) = \{i_1, \ldots, i_k\} \} \subseteq G_e. \]

If $x \in G_e$, then $\mathrm{dom}(x) = \mathrm{ran}(x)$. Furthermore, if $\mathrm{rk}(x) \ne k$, then $x \notin G_e$. If
$j \in \mathrm{dom}(x)$ with $j \notin \{i_1, \ldots, i_k\}$, then $x^{-1}x \ne e$, and so $x \notin G_e$. Thus

\[ G_e = \{ x \in G \wr R_n : \mathrm{dom}(x) = \mathrm{ran}(x) = \{i_1, \ldots, i_k\} \}, \]

which is isomorphic to $G \wr S_k$ in the obvious way (i.e., for $x \in G_e$, delete the rows
and columns of $x$ which contain only zeroes). □

We will also need to understand the poset structure of $G \wr R_n$.

Theorem 6.5. Let $s, t \in G \wr R_n$. Then $s \le t$ if and only if $s$ may be obtained by
replacing entries in $t$ with 0.

Proof. This follows directly from the definition

\[ s \le t \iff s = et \text{ for some idempotent } e \in G \wr R_n \]

together with the fact that the idempotents of $G \wr R_n$ are the idempotents of $R_n$
(i.e., the restrictions of the identity matrix). □

Finally, we record the size of $G \wr R_n$.

Theorem 6.6.

\[ |G \wr R_n| = \sum_{k=0}^{n} \binom{n}{k}^2 k!\, |G|^k. \]

Proof. There are $\binom{n}{k}^2 k!$ rook matrices of rank $k$, and for a given rook matrix $X$ of
rank $k$, there are $|G|$ options to replace each of the 1's in $X$ with elements of $G$. □

6.2. The FFT for $G \wr R_n$. We now explain our FFT for $G \wr R_n$. We begin by
handling the term in Theorem 4.9 concerning the change of basis from the groupoid
basis of $G \wr R_n$ to a Fourier basis. We write $\mathbb{C}[G \wr R_n]$ and $\mathbb{C}[G \wr S_k]$ for the complex
algebras of rook monoid and symmetric group wreath products. Theorem 4.5 and
the discussion in Section 6.1 imply that we have

\[ \mathbb{C}[G \wr R_n] \cong \bigoplus_{k=0}^{n} M_{\binom{n}{k}}\bigl(\mathbb{C}[G \wr S_k]\bigr). \]

Theorem 4.9 then applies and yields

\[ C(G \wr R_n) \le C(\zeta_{G \wr R_n}) + \sum_{k=0}^{n} \binom{n}{k}^2 C(G \wr S_k). \]

In [19] . D. Rockmore constructs a complete set of inequivalent, irreducible rep- 
resentations yk for C[G I Sk] and proves the following [111 Corollary 2]. 

Theorem 6.7. Let h denote the number of inequivalent, irreducible representations 
of G. Then 

. . . , , „ , . ^ . , . >2,-i, _|_ T ^,2 

Ty,iGiSk)<kl\G\' 



C{G) fc(fc + l) , ^H k\k + lf ■ 
|G| ■ 2 +^ 4 +^ 



In particular, $C(G \wr S_k)$ is bounded by the same amount. We take the $\mathcal{Y}_k$ and
tensor them up to be our set of representations $\mathcal{Y}$ for $\mathbb{C}[G \wr R_n]$.

Theorem 6.8. We have

\[ C(G \wr R_n) = C(\zeta_{G \wr R_n}) + O(|G \wr R_n| \log^4 |G \wr R_n|). \]

Proof. We have



n / \ 2 

Ty{G I Rn) < CiComJ + E [k) 
<C{Cg>rJ + 

<CiCG>Rj + 



c{G) fc(fc + i) , ,^ kHk + if , ; 

|G| ■ 2 +^ 4 



|G| ■ 2 +^ 4 +^ 
C(G) n (n + l) ^ ^j ^jn + lf ^ ^ 



k=t) 



\G\ 



\GlRn\. 



Now, $|G|$, $C(G)$, and $2^h$ are constants with respect to $n$, and $n = O(\log |G \wr R_n|)$.
The theorem follows. □

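Now, let $f \in \mathbb{C}[G \wr R_n]$ be an arbitrary element, expressed with respect to the
semigroup basis:

\[ f = \sum_{s \in G \wr R_n} f(s)\, s. \]

We would like to express $f$ with respect to the groupoid basis:

\[ f = \sum_{s \in G \wr R_n} g(s) \lfloor s \rfloor, \]

where, by (3), the coefficients $g(s)$ are given by

\[ g(s) = \sum_{t \ge s} f(t). \]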

Our goal is to compute the coefficients $g(s)$ in an efficient manner, and we give
an algorithm below for doing so. Note that the algorithm below reduces to the
algorithm for the fast zeta transform for $R_n$ given in Section 5 when $G = \mathbb{Z}_1$. As
in Section 5, we begin by proving a recursive formula for the size of $G \wr R_n$.

Theorem 6.9. For $n \ge 3$,

\[ |G \wr R_n| = (2n-1)|G|\,|G \wr R_{n-1}| + |G \wr R_{n-1}| - (n-1)^2 |G|^2 |G \wr R_{n-2}|. \]

Proof. $G \wr R_n$ consists of those elements having all 0's in column 1 and row 1 (of
which there are $|G \wr R_{n-1}|$), together with, for each $x \in G$ and $\alpha \in \{1, \ldots, n\}$, those
having an $x$ in position $(\alpha, 1)$ (of which there are $n|G|\,|G \wr R_{n-1}|$ total), together
with, for each $x \in G$ and $\alpha \in \{2, \ldots, n\}$, those having an $x$ in position $(1, \alpha)$ (of
which there are $(n-1)|G|\,|G \wr R_{n-1}|$ total). Counting the number of elements of
$G \wr R_n$ in this way overcounts. For each pair $\alpha, \beta$ with $2 \le \alpha, \beta \le n$ and for each
pair of elements $x, y \in G$, every element with $x$ in position $(\alpha, 1)$ and $y$ in position
$(1, \beta)$ (of which there are $(n-1)^2 |G|^2 |G \wr R_{n-2}|$ total) gets counted twice. □
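The recursive formula can be checked against the rank-by-rank count. The sketch below assumes the closed form $|G \wr R_n| = \sum_k \binom{n}{k}^2 k!\,|G|^k$ (the count recorded in Theorem 6.6) and the recurrence $|G \wr R_n| = (2n-1)|G|\,|G \wr R_{n-1}| + |G \wr R_{n-1}| - (n-1)^2|G|^2|G \wr R_{n-2}|$; only the order `g` of the group matters, and the function names are illustrative.

```python
from math import comb, factorial

def wreath_size(g, n):
    """|G wr R_n| for a group of order g: rank-k rook matrices times g^k labelings."""
    return sum(comb(n, k) ** 2 * factorial(k) * g ** k for k in range(n + 1))

def recurrence_rhs(g, n):
    """Right-hand side of the recurrence, built from the sizes for n-1 and n-2."""
    a, b = wreath_size(g, n - 1), wreath_size(g, n - 2)
    return (2 * n - 1) * g * a + a - (n - 1) ** 2 * g ** 2 * b

if __name__ == "__main__":
    for g in (1, 2, 3):
        print(g, [wreath_size(g, n) for n in range(6)])
```

With `g = 1` the sizes reduce to those of the rook monoid, matching the remark that $R_n$ is recovered as $\mathbb{Z}_1 \wr R_n$.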

We now explain the fast zeta transform. As in Section 5, let us denote $\sum_{t \ge s} f(t)$
by $\zeta_f(s)$. Let $s \in G \wr R_n$. The rows and columns of $s$ are indexed by $\{1, 2, \ldots, n\}$.

• Let $d_i(s)$ be the index of the $i$th column of $s$ which contains only zeroes.

• Let $r_i(s)$ be the index of the $i$th row of $s$ which contains only zeroes.

Define "partial" zeta transforms at $s$ as follows:

\[
\zeta_f\bigl(s, \{d_1(s), d_2(s), \ldots, d_m(s)\}, \{r_1(s), r_2(s), \ldots, r_m(s)\}\bigr)
 = \sum_{\substack{t \ge s: \\ \text{columns } d_1(s), \ldots, d_m(s) \text{ of } t \text{ contain only zeroes and} \\ \text{rows } r_1(s), \ldots, r_m(s) \text{ of } t \text{ contain only zeroes}}} f(t).
\]

As with $R_n$, we work from the "top" down, and our zeta transform proceeds as
follows, with steps $0, 1, \ldots, n$:

• Step 0: For all $s \in G \wr R_n$ with $\mathrm{rk}(s) = n$, compute all $\zeta_f(s, \{\}, \{\}) = \zeta_f(s)$
(0 operations).

• Step 1: For all $s \in G \wr R_n$ with $\mathrm{rk}(s) = n-1$, compute $\zeta_f(s, \{\}, \{\}) = \zeta_f(s)$
and $\zeta_f(s, \{d_1(s)\}, \{r_1(s)\})$ ($|G|$ operations for each element $s$).

• Step $n-k$: For all $s \in G \wr R_n$ with $\mathrm{rk}(s) = k$, compute all

\[
\begin{aligned}
&\zeta_f(s, \{\}, \{\}) = \zeta_f(s), \\
&\zeta_f(s, \{d_1(s)\}, \{r_1(s)\}), \\
&\zeta_f(s, \{d_1(s), d_2(s)\}, \{r_1(s), r_2(s)\}), \\
&\qquad \vdots \\
&\zeta_f(s, \{d_1(s), d_2(s), \ldots, d_{n-k}(s)\}, \{r_1(s), r_2(s), \ldots, r_{n-k}(s)\}).
\end{aligned}
\]



Thus, instead of computing just $\zeta_f(s)$ for the elements $s$ of rank $k$, we compute
$\zeta_f(s)$ along with $n-k$ other numbers for each element $s$ of rank $k$. These other
numbers are needed for the efficient computation of the zeta transform at elements
of rank $k-1$ and $k-2$, and can be discarded when they are no longer needed. We are
currently unable to give precise bounds on the amount of memory required for this
algorithm, partly due to the lack of an asymptotic formula for $|G \wr R_n|$. However, it
is immediate that it requires the storage of no more than $(n+1)|G \wr R_n|$ complex
numbers in memory during runtime.

Theorem 6.10. Step $n-k$ requires at most

\[ \left( |G|(n-k)^2 + |G|^2\, \frac{(n-k-1)(n-k)(2n-2k-1)}{6} \right) \binom{n}{k}^2 k!\, |G|^k \]

operations in total.

Proof. We will show that, for an element $s \in G \wr R_n$ with $\mathrm{rk}(s) = k$, computing all

\[
\begin{aligned}
&\zeta_f(s, \{\}, \{\}) = \zeta_f(s), \\
&\zeta_f(s, \{d_1(s)\}, \{r_1(s)\}), \\
&\zeta_f(s, \{d_1(s), d_2(s)\}, \{r_1(s), r_2(s)\}), \\
&\qquad \vdots \\
&\zeta_f(s, \{d_1(s), d_2(s), \ldots, d_{n-k}(s)\}, \{r_1(s), r_2(s), \ldots, r_{n-k}(s)\})
\end{aligned}
\]

requires at most

\[ |G|(n-k)^2 + |G|^2\, \frac{(n-k-1)(n-k)(2n-2k-1)}{6} \]

additions, assuming that steps $0, 1, \ldots, n-k-1$ have already been completed.

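Suppose $G = \{g_1, g_2, \ldots, g_{|G|}\}$. Let $s * (g_y E_{r_j(s), d_i(s)})$ denote the element of $G \wr R_n$
that is obtained by inserting $g_y$ into the $(r_j(s), d_i(s))$ position of $s$. For example, if
$s \in G \wr R_7$ is a matrix whose only nonzero entries $g_{y_1}, g_{y_2}, g_{y_3}$ lie in rows 2, 3, 5 and
columns 1, 3, 7, then

\[
\begin{aligned}
d_1(s) &= 2, & r_1(s) &= 1, \\
d_2(s) &= 4, & r_2(s) &= 4, \\
d_3(s) &= 5, & r_3(s) &= 6, \\
d_4(s) &= 6, & r_4(s) &= 7,
\end{aligned}
\]

and $s * (g_{y_4} E_{r_1(s), d_2(s)})$ is the matrix obtained from $s$ by placing $g_{y_4}$ in position $(1, 4)$.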



Now, consider the poset of elements $t \in G \wr R_n$ with $t \ge s$. This poset is
isomorphic to the poset for $G \wr R_{n-k}$, with $s$ in the place of the 0 matrix.

As in Section 5, we compute our partial zeta transforms at $s$ in the following
order. First, we have

\[ \zeta_f(s, \{d_1(s), d_2(s), \ldots, d_{n-k}(s)\}, \{r_1(s), r_2(s), \ldots, r_{n-k}(s)\}) = f(s), \]

which requires no operations. Next, let $D = \{d_1(s), d_2(s), \ldots, d_{n-k-1}(s)\}$ and
$R = \{r_1(s), r_2(s), \ldots, r_{n-k-1}(s)\}$. We have

\[
\zeta_f(s, D, R) = \sum_{i=1}^{|G|} \zeta_f(s * (g_i E_{r_{n-k}(s), d_{n-k}(s)}), D, R)
 + \zeta_f(s, \{d_1(s), d_2(s), \ldots, d_{n-k}(s)\}, \{r_1(s), r_2(s), \ldots, r_{n-k}(s)\}),
\]

which requires $|G|$ operations.

Now, suppose $D = \{d_1(s), d_2(s), \ldots, d_m(s)\}$ and $R = \{r_1(s), r_2(s), \ldots, r_m(s)\}$,
with $m < n-k-1$. Following the proof of Theorem 6.9, we have

\[
\begin{aligned}
\zeta_f(s, D, R) ={}& \sum_{i=1}^{|G|} \zeta_f(s * (g_i E_{r_{m+1}(s), d_{m+1}(s)}), D, R) \\
&+ \sum_{i=1}^{|G|} \zeta_f(s * (g_i E_{r_{m+1}(s), d_{m+2}(s)}), D, R) + \cdots + \sum_{i=1}^{|G|} \zeta_f(s * (g_i E_{r_{m+1}(s), d_{n-k}(s)}), D, R) \\
&+ \sum_{i=1}^{|G|} \zeta_f(s * (g_i E_{r_{m+2}(s), d_{m+1}(s)}), D, R) + \cdots + \sum_{i=1}^{|G|} \zeta_f(s * (g_i E_{r_{n-k}(s), d_{m+1}(s)}), D, R) \\
&- \sum_{\substack{i,j \in \{m+2, \ldots, n-k\} \\ a,b \in \{1, 2, \ldots, |G|\}}} \zeta_f(s * (g_a E_{r_{m+1}(s), d_i(s)}) * (g_b E_{r_j(s), d_{m+1}(s)}), D, R) \\
&+ \zeta_f(s, D \cup \{d_{m+1}(s)\}, R \cup \{r_{m+1}(s)\}).
\end{aligned}
\]

Computing $\zeta_f(s, D, R)$ therefore requires no more than

\[ |G|(2n - 2k - 2m - 1) + |G|^2 (n-k-m-1)^2 \]

operations. We do this for $m = n-k$ to 0 to compute, in order,

\[
\begin{aligned}
&\zeta_f(s, \{d_1(s), d_2(s), \ldots, d_{n-k}(s)\}, \{r_1(s), r_2(s), \ldots, r_{n-k}(s)\}), \\
&\qquad \vdots \\
&\zeta_f(s, \{d_1(s), d_2(s)\}, \{r_1(s), r_2(s)\}), \\
&\zeta_f(s, \{d_1(s)\}, \{r_1(s)\}), \\
&\zeta_f(s, \{\}, \{\}) = \zeta_f(s),
\end{aligned}
\]

which is what we want. The total number of operations required to compute these
is thus no more than

\[
|G| + \sum_{m=0}^{n-k-2} \Bigl[ |G|(2n - 2k - 2m - 1) + |G|^2 (n-k-m-1)^2 \Bigr]
 = |G|(n-k)^2 + |G|^2\, \frac{(n-k-1)(n-k)(2n-2k-1)}{6}.
\]

□
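The closed form for the per-element operation count can be confirmed by direct summation. The sketch below assumes the per-step cost $|G|(2N - 2m - 1) + |G|^2(N - m - 1)^2$ for $m \le N - 2$ and $|G|$ operations for $m = N - 1$, where $N = n - k$ and `g` stands for $|G|$; the function names are illustrative.

```python
def summed_ops(g, N):
    """Operations for one element with N = n - k empty rows, summed over m."""
    if N == 0:
        return 0  # rank-n element: zeta_f(s) = f(s), nothing to add
    return g + sum(g * (2 * N - 2 * m - 1) + g * g * (N - m - 1) ** 2
                   for m in range(N - 1))  # m = 0, ..., N - 2

def closed_form(g, N):
    """g N^2 + g^2 (N-1) N (2N-1) / 6; the product (N-1)N(2N-1) is divisible by 6."""
    return g * N * N + g * g * ((N - 1) * N * (2 * N - 1) // 6)

if __name__ == "__main__":
    for g in (1, 2, 5):
        print(g, [summed_ops(g, N) for N in range(6)])
```

Setting `g = 1` recovers the count $(n-k)^2 + (n-k-1)(n-k)(2n-2k-1)/6$ used for the rook monoid in Section 5.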



This algorithm yields the following bound.

Theorem 6.11. $C(\zeta_{G \wr R_n}) = O(|G \wr R_n| \log^3 |G \wr R_n|)$.

Proof. For $n \ge 3$,

\[
|G|(n-k)^2 + |G|^2\, \frac{(n-k-1)(n-k)(2n-2k-1)}{6}
 \le |G| n^2 + |G|^2\, \frac{n^3}{3}
 \le \frac{2}{3} |G|^2 n^3.
\]

Hence

\[ C(\zeta_{G \wr R_n}) \le \frac{2}{3} |G|^2 n^3 |G \wr R_n|. \]

Since $|G|$ is a constant with respect to $n$ and $n = O(\log |G \wr R_n|)$, we are done. □

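Combining Theorems 6.8 and 6.11, we obtain:

Theorem 6.12. $C(G \wr R_n) = O(|G \wr R_n| \log^4 |G \wr R_n|)$.

Proof. We have

\[
C(G \wr R_n) = C(\zeta_{G \wr R_n}) + O(|G \wr R_n| \log^4 |G \wr R_n|)
 = O(|G \wr R_n| \log^3 |G \wr R_n|) + O(|G \wr R_n| \log^4 |G \wr R_n|)
 = O(|G \wr R_n| \log^4 |G \wr R_n|).
\]

□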

The generalization of the theory of Fourier transforms to inverse semigroups
and beyond presents a new set of interesting challenges. Theorem 4.9 opens the
door for the development of more FFTs on inverse semigroups, as it reduces the 
problem of creating these Fourier transforms to the problems of creating FFTs on 
their maximal subgroups and creating fast zeta transforms on their poset structures. 
While the theory of group FFTs is well-developed, the theory of fast zeta transforms 
is not. An interesting line of research, then, would be to create a theory of fast zeta 
transforms for inverse semigroup posets. On the other hand, the poset structure of 
an inverse semigroup can be about as bad as one wants — any meet semilattice is 
possible. It remains to be seen whether there are any guiding principles one might 
employ when creating fast zeta transforms. 

We would also like to develop applications of these FFTs. In general, whereas 
groups capture global symmetries, inverse semigroups capture partial symmetries 
(see [TO for more on this idea). It is reasonable, then, that FFTs for certain inverse 
semigroups would be useful for data analysis in situations where FFTs on the 
analogous groups are useful. For example, the Fourier transform on the symmetric 
group has been used for the statistical analysis of voting data and Fourier 
transforms on symmetric group wreath products may be used for the statistical 
analysis of nested designs [TH]- We are currently investigating applications of the 
Fourier transform on the rook monoid to the statistical analysis of voting datasets
which contain incomplete voter preferences.